NAVAL RESEARCH LOGISTICS
SEPTEMBER 1973
VOL. 20, NO. 3
OFFICE OF NAVAL RESEARCH
NAVSO P-1278
NAVAL RESEARCH LOGISTICS QUARTERLY
EDITORS
F. D. Rigby
Texas Tech. University
B. J. McDonald
Office of Naval Research
O. Morgenstern
New York University
S. M. Selig
Managing Editor
Office of Naval Research
Arlington, Va. 22217
ASSOCIATE EDITORS
R. Bellman, RAND Corporation
J. C. Busby, Jr., Captain, SC, USN (Retired)
W. W. Cooper, Carnegie Mellon University
J. G. Dean, Captain, SC, USN
G. Dyer, Vice Admiral, USN (Retired)
P. L. Folsom, Captain, USN (Retired)
M. A. Geisler, RAND Corporation
A. J. Hoffman, International Business
Machines Corporation
H. P. Jones, Commander, SC, USN (Retired)
S. Karlin, Stanford University
H. W. Kuhn, Princeton University
J. Laderman, Office of Naval Research
R. J. Lundegard, Office of Naval Research
W. H. Marlow, The George Washington University
R. E. McShane, Vice Admiral, USN (Retired)
W. F. Millson, Captain, SC, USN
H. D. Moore, Captain, SC, USN (Retired)
M. I. Rosenberg, Captain, USN (Retired)
D. Rosenblatt, National Bureau of Standards
J. V. Rosapepe, Commander, SC, USN (Retired)
T. L. Saaty, University of Pennsylvania
E. K. Scofield, Captain, SC, USN (Retired)
M. W. Shelly, University of Kansas
J. R. Simpson, Office of Naval Research
J. S. Skoczylas, Colonel, USMC
S. R. Smith, Naval Research Laboratory
H. Solomon, The George Washington University
I. Stakgold, Northwestern University
E. D. Stanley, Jr., Rear Admiral, USN (Retired)
C. Stein, Jr., Captain, SC, USN (Retired)
R. M. Thrall, Rice University
T. C. Varley, Office of Naval Research
J. F. Tynan, Commander, SC, USN (Retired)
J. D. Wilkes, Department of Defense
OASD (ISA)
The Naval Research Logistics Quarterly is devoted to the dissemination of scientific information in logistics and
will publish research and expository papers, including those in certain areas of mathematics, statistics, and economics,
relevant to the overall effort to improve the efficiency and effectiveness of logistics operations.
Information for Contributors is indicated on inside back cover.
The Naval Research Logistics Quarterly is published by the Office of Naval Research in the months of March, June,
September, and December and can be purchased from the Superintendent of Documents, U.S. Government Printing
Office, Washington, D.C. 20402. Subscription Price: $10.00 a year in the U.S. and Canada, $12.50 elsewhere. Cost of
individual issues may be obtained from the Superintendent of Documents.
The views and opinions expressed in this quarterly are those of the authors and not necessarily those of the Office
of Naval Research.
Issuance of this periodical approved in accordance with Department of the Navy Publications and Printing Regulations,
NAVEXOS P35
Permission has been granted to use the copyrighted material appearing in this publication.
SEQUENTIAL DETERMINATION OF INSPECTION
EPOCHS FOR RELIABILITY SYSTEMS WITH
GENERAL LIFETIME DISTRIBUTIONS*
S. Zacks and W. J. Fenske
Department of Mathematics and Statistics
Case Western Reserve University
Cleveland, Ohio
ABSTRACT
The problem of determining optimal inspection epochs is studied for reliability
systems in which N components operate in parallel. The lifetime distribution is arbitrary,
but known. The optimization is carried out with respect to two cost factors: the cost of inspecting
a component and the cost of failure. The inspection epochs are determined so that the
expected cost of the whole system per time unit per cycle is minimized. In the general case
the optimization process depends on the whole failure history of the system; this
dependence is characterized. The case of Weibull lifetime distributions is elaborated
and illustrated numerically. The characteristics of the optimal inspection intervals are
studied theoretically.
1. INTRODUCTION
In the present study we investigate the problem of determining the optimal inspection epochs
of a reliability system which is comprised of N components, operating independently (in parallel)
and having the same lifetime distribution. The lifetime distribution is known. An inspector visits
the system at a predetermined inspection epoch and finds a certain number of components which
have failed. The exact times of failure are unknown. All the components which have failed during
the interval between inspections are replaced by new components. Components which have not failed
are left in the system. We consider two types of cost factors: (i) the cost of inspection, which depends
on the number of components in the system; and (ii) the cost of failure per unit time, which
measures the loss due to failed components. The objective is to determine an inspection
policy that is optimal with respect to the criterion of minimizing the total expected (discounted)
cost over the entire future. However, since we are dealing with cases of general lifetime distributions
(not necessarily exponential), the dynamic programming solution is excessively complicated, even
in the truncated case (when the number of inspections may not exceed a prescribed bound). Therefore,
we consider in the present paper a sequential myopic procedure. Accordingly, after each
inspection the epoch of the next inspection is determined as a function of the whole past failure
history of the system. The aim is to minimize the conditional expected cost per time unit from the
present time until the next inspection epoch. In the case of exponential lifetime distributions (constant
failure rates) the optimal inspection interval (the time interval between inspections) does not depend
on the past history of the system. As shown in the present study, if the lifetime distribution is not
exponential this dependence may be very strong, especially if N is not large and the lifetime
distribution has a decreasing failure rate (DFR). The dependence of the optimal inspection intervals on the
observed number of failures, and on the number of components that were replaced at previous
inspections and are still operating, is explicitly characterized. We start in Section 2 by formulating
the model and the associated distributions. In Section 3 we develop a general formula for the sequential
determination of the optimal length of the inspection intervals. In Section 4 we derive the corresponding
formulas for lifetime distributions of the Weibull family and illustrate the process with a numerical
example. In Section 5 we explain the complex process illustrated in the example of Section 4
by further theoretical development.
There are numerous papers in the reliability literature on inspection epochs and optimal
maintenance. For the general theory see Chapter 4 of Barlow and Proschan [1]. Articles which are close to the
present study are those of Kamins [4], Kander [5], Kander and Naor [6], and Kander and Rabinovitch [7].
The present study provides further elaboration of a chapter in the thesis of Fenske [3]. The main
difference between the present study and the articles mentioned above is in the basic model: the present
study is concerned with multicomponent systems, while the other studies treat the whole system as one
component. The study of Ehrenfeld [2] was based on a model similar to ours, but Ehrenfeld considered
the problem of determining the inspection interval for the estimation of the mean time between failures
in the exponential case.
2. THE MODEL AND ASSOCIATED DISTRIBUTIONS
Consider a reliability system which consists of N (N ≥ 1) components. These components operate
independently (in parallel). Let T designate the lifetime of a component. This is a random variable
having a known distribution function (c.d.f.) F(t). We assume that F(t) is absolutely continuous, with a
positive density function f(t), 0 < f(t) < ∞, and F(0) = 0. We further assume that the expected value
of T according to F(t) is finite. Let S_0 = 0, and let S_0 < S_1 < S_2 < ... < S_m < ... designate a
sequence of inspection epochs. Let J_m (m = 1, 2, ...) designate the number of components that failed
during the time interval (S_{m-1}, S_m). All the J_m components are replaced at the inspection epoch S_m.
The N − J_m components which have not failed during (S_{m-1}, S_m) are classified into m disjoint subsets
A_0^{(m)}, A_1^{(m)}, ..., A_{m-1}^{(m)}. The subset A_j^{(m)} (j = 0, ..., m − 1) contains all the components that were
replaced at epoch S_j and did not fail throughout the time interval (S_j, S_m). Let n_j^{(m)} designate the number of
elements of A_j^{(m)}. Obviously, A_j^{(m+1)} ⊆ A_j^{(m)} and n_j^{(m+1)} ≤ n_j^{(m)} for each j = 0, 1, ... and m = j, j + 1, ....
Let n_m^{(m)} = J_m, and n^{(m)} = (n_0^{(m)}, n_1^{(m)}, ..., n_m^{(m)}) for each m = 0, 1, ...; n^{(0)} = (N).
If a component belongs to the subset A_j^{(m)}, then its conditional lifetime distribution at time t is:

(2.1)    F_j^{(m)}(t) = P{T ≤ t − S_j | T > S_m − S_j}

                      = 0,                                                  if t ≤ S_m,

                      = [F(t − S_j) − F(S_m − S_j)] / [1 − F(S_m − S_j)],   if t > S_m.

In particular, F_m^{(m)}(t) = F(t − S_m) if t > S_m, and zero otherwise. The conditional densities of
U = T − (S_m − S_j), corresponding to the lifetime T of a component which belongs to A_j^{(m)}, play an
important role in our procedure. We call U the remaining lifetime. If a component is chosen at
random at time t = S_m + 0, its remaining lifetime U has the conditional density function

(2.2)    h_m(u | S^{(m)}, n^{(m)}) = (1/N) Σ_{j=0}^{m} n_j^{(m)} f(u + S_m − S_j) / [1 − F(S_m − S_j)],

where S^{(m)} = (S_1, ..., S_m).

We notice that if T has a negative exponential distribution, i.e., f(t) = λe^{−λt}, t ≥ 0, for any 0 < λ < ∞,
then h_m(u | S^{(m)}, n^{(m)}) = f(u) for all m = 1, 2, ... and all (S^{(m)}, n^{(m)}). This is a well-known property
of the negative exponential distributions. Let H_m(u | S^{(m)}, n^{(m)}) designate the c.d.f. corresponding
to (2.2).
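As a concrete illustration, the mixture density (2.2) can be coded directly. The sketch below (plain Python; the function name is ours, not the paper's) evaluates h_m for an arbitrary history and checks the memoryless property noted above for exponential lifetimes.

```python
import math

def remaining_life_density(u, epochs, counts, N, f, F):
    """Mixture density h_m(u | S^(m), n^(m)) of Eq. (2.2):
    a component drawn at random just after epoch S_m has remaining-life
    density (1/N) * sum_j n_j^(m) f(u + S_m - S_j) / (1 - F(S_m - S_j)).
    `epochs` holds S_0, ..., S_m and `counts` holds n_0^(m), ..., n_m^(m)."""
    S_m = epochs[-1]
    total = 0.0
    for S_j, n_j in zip(epochs, counts):
        total += n_j * f(u + S_m - S_j) / (1.0 - F(S_m - S_j))
    return total / N

# Exponential lifetimes (theta = 100 hr): h_m must reduce to f itself,
# whatever the history -- the memoryless property noted in the text.
theta = 100.0
f = lambda t: math.exp(-t / theta) / theta
F = lambda t: 1.0 - math.exp(-t / theta)

h = remaining_life_density(30.0, [0.0, 82.5, 165.0], [1, 3, 6], 10, f, F)
print(abs(h - f(30.0)) < 1e-12)
```

The made-up history (epochs 0, 82.5, 165 hr with counts 1, 3, 6) is purely illustrative; any history with counts summing to N gives the same reduction in the exponential case.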
3. SEQUENTIAL DETERMINATION OF INSPECTION EPOCHS
We consider in the present section the problem of deriving an inspection policy which attains a
certain economic objective. We assume that the cost of inspecting the system is $C_0 per
inspection and, on the other hand, that if an element fails then the cost associated with its failure is $C_f per time
unit. The inspection policy adopted here is the following: given the history connected with the past m
inspection intervals, i.e., (S^{(m)}, n^{(m)}), determine the (m+1)st inspection epoch so that the average
expected cost per time unit of inspection and of failure over the (m+1)st inspection interval is
minimized. We remark in this connection that this policy is in essence a myopic policy, which
minimizes the expected time-average cost for each inspection interval individually. A dynamic
programming determination of the inspection epochs could attain a more global optimization. However,
attempts at dynamic programming solutions lead to complicated sets of recursive functional
equations, and the solution of these equations is generally very tedious. As will be shown later, the suggested
myopic procedure is globally optimal if the lifetime distribution is exponential. In other cases of interest,
like the Weibull distributions, the myopic procedure does not coincide with the global dynamic
programming solution. A study of the relative efficiency of the myopic procedure is still under way.
Let Δ designate the length of the (m+1)st inspection interval; that is, Δ = S_{m+1} − S_m. Given
(S^{(m)}, n^{(m)}), the conditional expected average cost per time unit, under Δ, is

(3.1)    R_m(Δ; S^{(m)}, n^{(m)}) = C_0/Δ + (C_f/Δ) Σ_{j=0}^{m} n_j^{(m)} ∫_0^Δ (Δ − u) f(u + S_m − S_j) / [1 − F(S_m − S_j)] du.

Or, in terms of the conditional distribution of the remaining lifetime U, we can express (3.1) in the form

(3.2)    R_m(Δ; S^{(m)}, n^{(m)}) = C_0/Δ + N C_f H_m(Δ | S^{(m)}, n^{(m)}) − (N C_f/Δ) ∫_0^Δ u h_m(u | S^{(m)}, n^{(m)}) du.

The optimal (m+1)st inspection epoch is defined as S_{m+1} = S_m + Δ⁰, where Δ⁰ is a positive real value
Δ for which the infimum of (3.2) is attained.
Let

(3.3)    μ_m = ∫_0^∞ u h_m(u | S^{(m)}, n^{(m)}) du

be the expected remaining life, given (S^{(m)}, n^{(m)}). According to the assumption of the previous section,
μ_m < ∞. Differentiating R_m(Δ; S^{(m)}, n^{(m)}) with respect to Δ, we obtain that if μ_m ≤ C_0/(N C_f) then Δ⁰ = ∞.
This is a case in which no more inspections are warranted. On the other hand, if μ_m > C_0/(N C_f), there
exists a unique solution, Δ⁰, to the equation:

(3.4)    ∫_0^Δ u h_m(u | S^{(m)}, n^{(m)}) du = C_0/(N C_f).

We realize from (3.4) that S_{m+1} is a function of the statistic (S^{(m)}, n^{(m)}) of the system.
As we have already mentioned, in cases of exponential lifetime distributions the optimal length
of the inspection intervals is the same for all m = 1, 2, .... If θ = λ^{−1} is the mean time between failures
(MTBF) in the exponential case, then μ_m = θ for all m, and the condition for a finite Δ⁰ is that C_0 < N C_f θ;
i.e., the cost of inspecting an element is smaller than the expected cost of failure of an element. If
this condition is satisfied then, letting γ = C_0/(N C_f θ), it is easy to show that

(3.5)    Δ⁰ = (θ/2) χ²_γ[4],

where χ²_γ[4] designates the γ-fractile of a chi-square distribution with 4 degrees of freedom.
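The closed form (3.5) is easy to evaluate. The sketch below uses the example values treated later in the paper (θ = 100 hr and γ = 0.2, i.e., C_0 = $200·N and C_f = $10) and obtains the γ-fractile of χ²[4] by bisection on its elementary c.d.f.; the helper names are ours.

```python
import math

def chi2_4_cdf(x):
    # P{chi-square with 4 d.f. <= x} = 1 - exp(-x/2) * (1 + x/2)
    return 1.0 - math.exp(-x / 2.0) * (1.0 + x / 2.0)

def chi2_4_fractile(gamma, lo=0.0, hi=100.0):
    # gamma-fractile by bisection (the c.d.f. is increasing)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_4_cdf(mid) < gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

theta, N, C0, Cf = 100.0, 10, 2000.0, 10.0    # C0 = $200*N with N = 10
gamma = C0 / (N * Cf * theta)                 # = 0.2, as in the later example
delta = 0.5 * theta * chi2_4_fractile(gamma)  # Eq. (3.5)
print(round(delta, 1))                        # about 82.4 hr (tables round chi2 to 1.65, giving 82.5)
```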
4. OPTIMAL INSPECTION EPOCHS FOR WEIBULL DISTRIBUTIONS
Suppose that the lifetime of an element, T, follows a Weibull distribution with density function

(4.1)    f(t; θ, α) = 0,                              if t ≤ 0,
                    = (α/θ) t^{α−1} exp{−t^α/θ},      if t > 0,

where α and θ are positive real parameters. We notice that if 0 < α < 1 then the distribution has a
decreasing failure rate (DFR), and if 1 < α < ∞ its failure rate is increasing (IFR). When α = 1 the
distribution is exponential. Given (S^{(m)}, n^{(m)}), the density function of the remaining life U assumes the
special form

(4.2)    h_m(u | S^{(m)}, n^{(m)}) = (1/N) Σ_{j=0}^{m} n_j^{(m)} exp{(S_m − S_j)^α/θ} (α/θ)(u + S_m − S_j)^{α−1} exp{−(u + S_m − S_j)^α/θ},

for 0 ≤ u < ∞. When m = 0, (4.2) reduces to (4.1). Following the procedure given in the previous section,
we realize that S_1 < ∞ if, and only if,

(4.3)    C_0/N < C_f θ^{1/α} Γ((1/α) + 1);

θ^{1/α} Γ((1/α) + 1) is the expected lifetime. If (4.3) is satisfied then the optimal value of S_1 is

(4.4)    S_1 = {θ G^{−1}(γ; 1, (1/α) + 1)}^{1/α},

where G^{−1}(γ; β, ν) is the γ-fractile of the Gamma distribution G(β, ν) with scale parameter β and shape
parameter ν, and where γ = C_0/(N C_f θ^{1/α} Γ((1/α) + 1)). We notice that if 2/α is a positive integer, then

(4.5)    S_1 = {(θ/2) χ²_γ[(2/α) + 2]}^{1/α}.
We determine now a general expression for the left-hand side of (3.4). According to (4.2),

(4.6)    ∫_0^Δ u h_m(u | S^{(m)}, n^{(m)}) du = (1/N) Σ_{j=0}^{m} n_j^{(m)} exp{(S_m − S_j)^α/θ}
             · (α/θ) ∫_0^Δ u (u + S_m − S_j)^{α−1} exp{−(u + S_m − S_j)^α/θ} du.

By a proper change of variable (w = (u + S_m − S_j)^α/θ), we obtain

(4.7)    (α/θ) ∫_0^Δ u (u + S_m − S_j)^{α−1} exp{−(u + S_m − S_j)^α/θ} du

             = ∫_{(S_m−S_j)^α/θ}^{(S_m−S_j+Δ)^α/θ} [θ^{1/α} w^{1/α} − (S_m − S_j)] exp{−w} dw

             = θ^{1/α} Γ((1/α) + 1) [G((S_m − S_j + Δ)^α/θ; 1, (1/α) + 1) − G((S_m − S_j)^α/θ; 1, (1/α) + 1)]

                 − (S_m − S_j) [exp{−(S_m − S_j)^α/θ} − exp{−(S_m − S_j + Δ)^α/θ}].

Substituting (4.7) into (4.6), we obtain that S_{m+1} = S_m + Δ⁰, where Δ⁰ is the root Δ of the equation:

(4.8)    (1/N) Σ_{j=0}^{m} n_j^{(m)} ( exp{(S_m − S_j)^α/θ} [G((S_m − S_j + Δ)^α/θ; 1, (1/α) + 1) − G((S_m − S_j)^α/θ; 1, (1/α) + 1)]

                 − [(S_m − S_j)/(θ^{1/α} Γ((1/α) + 1))] [1 − exp{−(1/θ)[(S_m − S_j + Δ)^α − (S_m − S_j)^α]}] ) = γ;

γ is as before, and G(x; β, ν) is the c.d.f. of G(β, ν) at x.
We notice that for m = 0 the solution of (4.8) reduces to the one given by (4.4). In Figure 1 we
illustrate the solution of (4.8) for three Weibull distributions, where the n_j^{(m)} sequences were generated
by Monte Carlo simulation. The cases under consideration have the following parameters: C_f = $10,
C_0 = $200·N, θ = 100 [hr], and α = 3/4, 1, and 5/4. The case α = 1 corresponds to the exponential
distribution with mean θ = 100. According to (3.5) the optimal inspection interval for α = 1 is of length
50 χ²_γ[4] [hr], where γ = C_0/(N θ C_f) = 0.2. One can find in statistical tables that χ²_{0.2}[4] = 1.65.
Hence, the optimal interval between inspections in the exponential case is of length 82.5 hours. The
case α = 5/4 represents an IFR distribution. We see in Figure 1 that the optimal inspection intervals
vary very little around a length of 59 hours. It is interesting to notice that in the present case of
an IFR distribution the optimal inspection intervals do not depend strongly on the number of components,
N, in the system. This is not the case when the Weibull distribution is DFR (α = 3/4). As illustrated
in Figure 1, the optimal intervals for DFR distributions, as obtained from (4.8), are sensitive to N. When
N = 10 there are considerable fluctuations of the solution of (4.8); when N = 100 these fluctuations
diminish. The general trend of growth in the length of the inspection intervals is, however, the same.
An explanation of this phenomenon is provided in the next section. Finally, we remark that the
numerical solution of Equation (4.8) in the case discussed here was attained by Newton-Raphson
iterative corrections to an initial solution. For further details see Fenske [3].

[Figure 1. Optimum inspection intervals for Weibull distributions with C_f = $10, C_0 = $200·N, and
θ = 100 (hr): interval length versus inspection number for α = 3/4 (N = 10, N = 100), α = 1
(exponential; N arbitrary), and α = 5/4.]
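The myopic step, i.e., solving Equation (3.4) for the Weibull model, can be sketched as follows. We substitute Simpson-rule quadrature and bisection for the Newton-Raphson iteration actually used by the authors, and the function names are ours. For an empty history (m = 0) the computed first intervals can be compared with the values behind Figure 1 and Table 1 (about 147.6 hr for α = 3/4, 82.4 hr for α = 1, 58.0 hr for α = 5/4).

```python
import math

def weibull_pdf(t, alpha, theta):
    # density of Eq. (4.1); the survival function is exp(-t**alpha / theta)
    if t <= 0.0:
        return 0.0
    return (alpha / theta) * t ** (alpha - 1.0) * math.exp(-t ** alpha / theta)

def lhs_34(delta, epochs, counts, N, alpha, theta, steps=2000):
    """Left-hand side of Eq. (3.4): integral_0^delta u h_m(u) du, with h_m
    the remaining-life density (4.2), computed by Simpson's rule."""
    S_m = epochs[-1]
    def integrand(u):
        if u == 0.0:
            return 0.0
        s = 0.0
        for S_j, n_j in zip(epochs, counts):
            d = S_m - S_j
            surv = math.exp(-d ** alpha / theta) if d > 0.0 else 1.0
            s += n_j * weibull_pdf(u + d, alpha, theta) / surv
        return u * s / N
    h = delta / steps
    total = integrand(0.0) + integrand(delta)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0

def next_interval(epochs, counts, N, alpha, theta, C0, Cf, hi=10000.0):
    """Root Delta of Eq. (3.4); returns None when mu_m <= C0/(N*Cf),
    the case in which no further inspection is warranted."""
    target = C0 / (N * Cf)
    if lhs_34(hi, epochs, counts, N, alpha, theta) <= target:
        return None
    lo = 0.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if lhs_34(mid, epochs, counts, N, alpha, theta) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# First interval (m = 0): all N components new, history ((0), (N)).
theta, N, Cf = 100.0, 10, 10.0
C0 = 200.0 * N
first = {a: next_interval([0.0], [N], N, a, theta, C0, Cf) for a in (0.75, 1.0, 1.25)}
for a in sorted(first):
    print(a, round(first[a], 1))
```

Later intervals follow by appending the new epoch and the observed counts n_j^{(m)} (from simulation or field data) to `epochs` and `counts` and calling `next_interval` again.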
5. FURTHER THEORETICAL CHARACTERIZATION OF THE SOLUTION
A characterization of the solution obtained from (3.4) is not a simple matter, since the inspection
epochs S_2, S_3, ... are random variables depending on the random vectors n^{(m)} in quite a complicated
manner. We remark that the sequences {n_j^{(m)}: m = j, j + 1, ...} are supermartingales for each
j = 0, 1, 2, ..., and lim_{m→∞} n_j^{(m)} exists, as shown. Furthermore, if S_m − S_{m−1} ≥ Δ for every m, then
lim_{m→∞} n_j^{(m)} = 0 for each j. This property holds for any lifetime distribution F. In order to obtain certain
theoretical approximations to the distributions of the roots of (3.4), we consider a modified problem in which,
for each m = 0, 1, ..., the random variables (n_j^{(m)}, j = 0, ..., m) are replaced by some fixed non-negative
values. More specifically, consider the distribution of the random variable

(5.1)    W_m = (1/N) Σ_{j=0}^{m} n_j^{(m)} [1 − F(S_m − S_j)]^{−1} ∫_0^Δ u f(u + S_m − S_j) du,

in which the inspection epochs are predetermined fixed values. W_m is the left-hand side of Equation
(3.4). Whenever S_1, S_2, ... are fixed inspection epochs, the vector (n_0^{(m)}, n_1^{(m)}, ..., n_m^{(m)}) has, for each
m = 1, 2, ..., a multinomial distribution with parameters N and (θ_j^{(m)}; j = 0, ..., m), where θ_j^{(m)}
is the probability that an element belongs to A_j^{(m)}, and Σ_{j=0}^{m} θ_j^{(m)} = 1. The probabilities θ_j^{(m)}
can be determined recursively according to the following formulae:
(5.2)    θ_0^{(m)} = 1 − F(S_m)

and

         θ_j^{(m)} = [1 − Σ_{i=0}^{j−1} θ_i^{(j)}] [1 − F(S_m − S_j)],    j = 1, ..., m.

It follows that for any fixed sequence of inspection epochs and for each m = 0, 1, ...,

(5.3)    E{n_j^{(m)}} = N θ_j^{(m)},    Var{n_j^{(m)}} = N θ_j^{(m)} (1 − θ_j^{(m)}),    j = 0, 1, ..., m,

and

(5.4)    cov(n_j^{(m)}, n_k^{(m)}) = −N θ_j^{(m)} θ_k^{(m)},    all 0 ≤ j < k ≤ m.

From (5.2) and (5.3) we conclude that if the length of each inspection interval is not smaller than some Δ > 0,
then for any distribution F, lim_{m→∞} θ_j^{(m)} = 0 for each j.
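The recursion (5.2) can be sketched as follows; the helper quantity r_j = 1 − Σ_{i<j} θ_i^{(j)} is the probability that a component slot is replaced at S_j. With the Weibull c.d.f. of Case II below (α = 3/4, θ = 100) and S_1 = 147.6, the output can be compared with the first row of Table 1; names are ours.

```python
import math

def cell_probabilities(epochs, F):
    """theta_j^(m) of Eq. (5.2) for fixed inspection epochs S_1, ..., S_m:
    theta_0^(m) = 1 - F(S_m), and for j >= 1
    theta_j^(m) = (1 - sum_{i<j} theta_i^(j)) * (1 - F(S_m - S_j)).
    Returns (theta_0^(m), ..., theta_m^(m)); S_0 = 0 is implicit."""
    S = [0.0] + list(epochs)
    m = len(S) - 1
    # replacement probabilities r_j = 1 - sum_{i<j} theta_i^(j), with r_0 = 1
    r = [1.0]
    for j in range(1, m + 1):
        r.append(1.0 - sum(r[i] * (1.0 - F(S[j] - S[i])) for i in range(j)))
    return [r[j] * (1.0 - F(S[m] - S[j])) for j in range(m + 1)]

F = lambda t: 1.0 - math.exp(-(t ** 0.75) / 100.0)   # Weibull, alpha = 3/4, theta = 100
probs = cell_probabilities([147.6], F)
print([round(p, 4) for p in probs])   # compare Table 1, Case II, row m = 1
```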
The variable W_m is a linear combination of multinomial random variables. Its expectation is

(5.5)    ω_m = E{W_m} = D_0^{(m)} + Σ_{j=1}^{m} [1 − Σ_{i=0}^{j−1} θ_i^{(j)}] D_j^{(m)},

where

(5.6)    D_j^{(m)} = ∫_0^Δ u f(u + S_m − S_j) du.

The variance of W_m is

(5.7)    Var{W_m} = (1/N) { Σ_{j=0}^{m} [θ_j^{(m)} − (θ_j^{(m)})²] (D_j^{(m)})² / [1 − F(S_m − S_j)]²

                 − 2 Σ_{0≤j<k≤m} θ_j^{(m)} θ_k^{(m)} D_j^{(m)} D_k^{(m)} / ([1 − F(S_m − S_j)][1 − F(S_m − S_k)]) }.
We have shown that for any sequence of inspection epochs, Var{W_m} = O(N^{−1}) as N → ∞. This explains
why the fluctuations of the roots of (4.8) are relatively large when N = 10 and small when N = 100. We
consider now a particular sequence of inspection epochs which consists of the values of S_m obtained by the
repeated solution (for each m) of the equation ω_m = C_0/(N C_f), i.e.,

(5.8)    ∫_0^Δ u f(u + S_m) du + Σ_{j=1}^{m} [1 − Σ_{i=0}^{j−1} θ_i^{(j)}] ∫_0^Δ u f(u + S_m − S_j) du = C_0/(N C_f).

S_1 is the root Δ of ∫_0^Δ u f(u) du = C_0/(N C_f), and for each m = 1, 2, ... the (m+1)st inspection epoch is
given by S_{m+1} = S_m + Δ. The sequence of fixed inspection epochs determined by this procedure corresponds
to the expected values of n^{(m)}, and we therefore label this procedure the Procedure of Averages. In
Table 1 we provide the inspection intervals determined by the Procedure of Averages, and the
corresponding multinomial probabilities θ_j^{(m)} (j = 0, ..., m), for the two cases represented in Figure 1. The
graph of the corresponding inspection intervals for the case α = 3/4 (DFR) is also plotted in Figure 1.
As demonstrated in Table 1, in the IFR case (α = 5/4) the significant contribution to the solution is
expected to be that of n_m^{(m)} and n_{m−1}^{(m)}, or of their corresponding expected values. Furthermore, the
optimal length of the inspection intervals varies very little with the number of inspections, m, and its
expectation reaches, in the present example, a stable situation after two inspections. This is not the
case, however, for the DFR distribution (α = 3/4). The probabilities θ_j^{(m)} approach zero, as m grows, very
slowly. This is reflected in a steady increase in the length of the inspection intervals as m grows, and a
stable situation is reached in the present example only after 10 inspections.
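The Procedure of Averages itself can be sketched in the same style; Simpson quadrature and bisection are our substitutes for the authors' numerics, and the DFR parameters of Case II are used so the output can be compared with Table 1.

```python
import math

alpha, theta = 0.75, 100.0          # Case II of Table 1 (DFR)
N, C0, Cf = 10, 2000.0, 10.0        # C0 = $200*N
target = C0 / (N * Cf)              # right-hand side of Eq. (3.4)

def f(t):
    # Weibull density (4.1)
    return (alpha / theta) * t ** (alpha - 1.0) * math.exp(-t ** alpha / theta) if t > 0.0 else 0.0

def D(delta, shift, steps=1000):
    # D_j^(m) = integral_0^delta u f(u + shift) du, by Simpson's rule (Eq. (5.6))
    h = delta / steps
    g = lambda u: 0.0 if u == 0.0 else u * f(u + shift)
    total = g(0.0) + g(delta)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0

def omega(delta, S, r):
    # Eq. (5.5): expected left-hand side of (3.4) under fixed epochs
    return sum(r_j * D(delta, S[-1] - S_j) for S_j, r_j in zip(S, r))

S, r = [0.0], [1.0]                 # S_0 = 0; r_j = 1 - sum_{i<j} theta_i^(j)
intervals = []
for m in range(4):                  # first four inspection intervals
    lo, hi = 0.0, 2000.0
    for _ in range(50):             # bisection: omega is increasing in delta
        mid = 0.5 * (lo + hi)
        if omega(mid, S, r) < target:
            lo = mid
        else:
            hi = mid
    delta = 0.5 * (lo + hi)
    intervals.append(delta)
    S.append(S[-1] + delta)
    r.append(1.0 - sum(r[i] * math.exp(-(S[-1] - S[i]) ** alpha / theta) for i in range(len(r))))
print([round(d, 1) for d in intervals])   # compare Table 1, Case II: 147.6, 156.8, 161.8, 164.8
```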
To ensure that the inspection intervals discussed in Sections 3 and 4 have properties similar
to those determined by procedures of fixed inspection epochs, we could consider the following
adjustment. First, determine for each m = 1, 2, ... two fixed sequences of inspection epochs which
constitute upper and lower (confidence) limits for the solution of (3.4) (or (4.8)). This can be done by
utilizing formulae (5.5) and (5.7). The lower confidence limits could be obtained by repeated solution
(for the root Δ) of the equation
Table 1. Values of the optimal inspection intervals Δ [hr] and the multinomial probabilities θ_j^{(m)} under the
Procedure of Averages, for Weibull distributions with θ = 100 [hr] and cost components C_0 = $200·N, C_f = $10

Case I: α = 5/4 (IFR)

 m   opt. Δ   j=0      j=1      j=2      j=3      j=4      j=5      j=6
 1    58.0   0.2017   0.7983
 2    59.3   0.0211   0.1543   0.8246
 3    59.1   0.0016   0.0161   0.1600   0.8223
 4    59.1   0.0010   0.0013   0.0167   0.1595   0.8225
 5    59.1            0.0001   0.0013   0.0166   0.1595   0.8225
 6    59.1                     0.0001   0.0013   0.0166   0.1595   0.8225

Case II: α = 3/4 (DFR)

 m   opt. Δ   j=0      j=1      j=2      j=3      j=4      j=5      j=6      j=7      j=8      j=9      j=10
 1   147.6   0.6548   0.3452
 2   156.8   0.4825   0.2216   0.2959
 3   161.8   0.3666   0.1624   0.1880   0.2830
 4   164.8   0.2839   0.1231   0.1372   0.1787   0.2771
 5   166.6   0.2229   0.0953   0.1039   0.1302   0.1742   0.2735
 6   167.9   0.1769   0.0748   0.0803   0.0984   0.1267   0.1716   0.2713
 7   168.8   0.1416   0.0594   0.0630   0.0761   0.0957   0.1246   0.1698   0.2698
 8   169.4   0.1142   0.0475   0.0500   0.0600   0.0739   0.0941   0.1233   0.1686   0.2686
 9   169.9   0.0927   0.0383   0.0401   0.0473   0.0580   0.0726   0.0930   0.1223   0.1678   0.2678
10   170.3   0.0756   0.0311   0.0323   0.0379   0.0460   0.0569   0.0718   0.0924   0.1216   0.1671   0.2673

(Blank cells in Case I correspond to probabilities that are zero to four decimal places.)
(5.9)    ω_m + 3 [Var{W_m}]^{1/2} = C_0/(N C_f),    m = 1, 2, ....

The upper limits can be obtained by solving the equation

(5.10)   ω_m − 3 [Var{W_m}]^{1/2} = C_0/(N C_f),    m = 1, 2, ....

In the second phase of computation, solve Equation (3.4). If the solution lies between the roots
of (5.9) and (5.10), proceed; otherwise truncate the solution to either the lower limit or the upper limit,
whichever is closer to the actual solution. Such an adjustment guarantees that every inspection
interval is bounded by lower and upper values which are determined by fixed sequences of
inspection epochs, and will therefore have the general characteristics established here.
REFERENCES

[1] Barlow, R. E. and F. Proschan, Mathematical Theory of Reliability (John Wiley and Sons, New York, 1967).
[2] Ehrenfeld, S., "Some Experimental Design Problems in Attribute Life Testing," J. Am. Stat. Assoc. 57, 668-679 (1962).
[3] Fenske, W. J., "Optimal Inspection Epochs for Reliability Studies," Ph.D. Dissertation, Department
    of Mathematics and Statistics, Case Western Reserve University (1972).
[4] Kamins, M., "Determining Checkout Intervals for Systems Subject to Random Failures," The Rand
    Corporation, Paper RM-2578 (1960).
[5] Kander, Z., "Inspection Policies of Deteriorating Equipment Characterized by N Quality Levels,"
    Technion-Israel Institute of Technology, Operations Research Monograph No. 93 (1971).
[6] Kander, Z. and P. Naor, "Optimization of Inspection Policies by Dynamic Programming," Technion-
    Israel Institute of Technology, Operations Research Monograph No. 61 (1970).
[7] Kander, Z. and A. Rabinovitch, "Maintenance Policies When Failure Distribution of Equipment is
    Only Partially Known," Technion-Israel Institute of Technology, Operations Research Monograph
    No. 92 (1972).
AN EMPIRICAL BAYES ESTIMATOR FOR THE SCALE
PARAMETER OF THE TWO-PARAMETER
WEIBULL DISTRIBUTION
G. Kemble Bennett
Virginia Polytechnic Institute and State University
and
H. F. Martz
Texas Tech University
ABSTRACT
An empirical Bayes estimator is given for the scale parameter of the two-parameter
Weibull distribution. The scale parameter is assumed to vary randomly throughout a
sequence of experiments according to a common, but unknown, prior distribution. The shape
parameter is assumed to be known; however, it may be different in each experiment. The
estimator is obtained by means of a continuous approximation to the unknown prior density
function. Results from Monte Carlo simulation are reported which show that the estimator
has smaller mean-squared errors than the usual maximum-likelihood estimator.
INTRODUCTION
A large number of authors have considered estimation of the parameters of the Weibull distribution
by the method of maximum likelihood, the method of moments, and numerous other classical
techniques. Frequently, however, the parameters of the Weibull distribution are subject to random variation,
and an analysis which encompasses this feature is best suited. Such an analysis has been performed
by Soland for the cases where the scale parameter is treated as a random variable [5] and where both
the shape and scale parameters are treated as random variables [6]. These approaches exhibit a Bayesian
viewpoint as adequately described by Raiffa and Schlaifer [4]. Emphasis is placed on determining
conjugate prior distributions and on performing both terminal and preposterior analysis. In this paper we
obtain empirical Bayes estimates for the scale parameter. This approach, like the Bayesian approach,
allows for the assumption of a randomly varying scale parameter. The analysis, however, does not
require any specific assumptions as to the distributional form of this parameter. Since this distribution
generally remains unknown, the empirical Bayes approach can, in a large majority of cases, be
successfully applied. Application would certainly be warranted, for example, in reliability life testing situations
where the lifetime distribution of items subjected to routine testing is adequately described by a Weibull
distribution but where the scale parameter varies from test to test.
THE EMPIRICAL BAYES APPROACH
Consider the situation in which we observe a value t (which may be vector valued) from a Weibull
density function given by

(1)    f(t | λ) = λ β t^{β−1} exp{−λ t^β},
and must estimate the parameter λ with small squared error. The shape parameter, β, is assumed to
be known; however, the scale parameter, λ, which determines t, is itself assumed to be a realization of
an unobservable random variable. Furthermore, it is assumed that this estimation process is a routinely
recurring situation. Therefore, as the process is repeated we obtain a sequence of realizations of
independent and identically distributed random variables t_1, t_2, ..., t_n. Our problem is to determine
an estimator λ̂_n = λ̂_n(t_1, ..., t_n) which minimizes E(λ̂_n − λ_n)², where the expectation is taken with
respect to all the random variables involved and where λ_n is the nth or current realization of λ. Since λ
is itself a random variable, this minimizing estimator is well known to be the Bayes estimator, the mean
of the posterior distribution. This estimator can be represented by

(2)    E(λ | t) = ∫ λ f(t | λ) g(λ) dλ / ∫ f(t | λ) g(λ) dλ,

where g(λ) is the true prior density function of λ.
Since the prior density usually remains unknown in practice, the estimator E(λ | t) cannot be
exactly determined. It can, however, be approximated using the information t_1, t_2, ..., t_n obtained
from previous realizations of λ. Such an estimator is commonly referred to as an empirical Bayes
estimator. For a detailed discussion of the empirical Bayes approach the reader is referred to, for example,
Maritz [2].
To illustrate this situation, consider a repetitive testing program in which the time-to-failure density
of tested items is given by (1). During each test a sample of k failure times is observed from (1) and an
estimate of λ is to be given. For example, at the first test a sample of k failure times is recorded and an
estimate of λ_1 is required. At the second test an additional random sample of k failure times is obtained
and an estimate of λ_2, which may be different from λ_1, is required. This situation continues until the
present or nth test is completed, at which time an estimate of λ_n is to be given. Due to changing
environmental conditions from test to test, imperfect testing equipment, interactions of population
components, etc., the values λ_1, λ_2, ..., λ_n are not likely to be equal, but to vary unpredictably and
thus randomly. This variation can therefore be described by a prior density function. However, since
the values of λ remain unknown, specification of g(λ) can often be risky. In the situation described here
the observed experimental data t_1, t_2, ..., t_n are used to approximate g(λ), thereby relieving the
experimenter of the task of specifying the form of g(λ).
DEVELOPMENT OF THE ESTIMATOR
Suppose that at each experiment j comprising a testing program a random sample from (1) is taken
and a maximum likelihood estimate

(3)    λ̂_k = k / Σ_{i=1}^{k} t_i^β

is formed. Then the sequence of estimates

(4)    λ̂_{k,1}, λ̂_{k,2}, ..., λ̂_{k,n}
provides a source of information on the past behavior of λ. Based on this sequence a linear
transformation can be performed which yields a new sequence of values

(5)    λ*_1, λ*_2, ..., λ*_n,

which, when considered collectively, have a mean and variance approximating those of the random
parameter λ. This sequence can now be used to approximate the prior density function g(λ). Proper
substitution of the approximation into (2) will yield an empirical Bayes estimate for λ_n, the nth or present
realization of λ.
The particular density approximation chosen is described by Parzen [3]. He presents a consistent
density estimator of the form

(6)    ĝ_n(λ) = (1/(n h(n))) Σ_{i=1}^{n} W((λ − λ_i)/h(n)),

where W(·) is a weighting function satisfying certain boundedness and regularity conditions, and
h(n) is a smoothing constant so chosen that lim_{n→∞} h(n) = 0 and lim_{n→∞} n h(n) = ∞. These restrictions are
placed on h(n) to assure the consistency of ĝ_n(λ). Parzen also lists several possible representations for
W(·). Using these results, Bennett and Martz [1] suggest the replacement of each unobservable λ_i in
(6) by its corresponding transformed estimate λ*_i, forming the density approximation

(7)    ĝ_n(λ) = (1/(n h(n))) Σ_{i=1}^{n} W((λ − λ*_i)/h(n)).

Subsequent substitution of (7) into (2) yields the empirical Bayes estimator

(8)    E_n(λ | t) = ∫ λ f(t | λ) ĝ_n(λ) dλ / ∫ f(t | λ) ĝ_n(λ) dλ.
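A minimal sketch of the Parzen-type approximation (6)-(7) follows. The standard normal kernel and the bandwidth h(n) = n^{−1/5} are illustrative choices of ours (they satisfy the stated conditions on W and h(n); the paper does not fix a particular pair), and the transformed estimates are made-up values.

```python
import math

def parzen_density(x, samples, h):
    """Eqs. (6)/(7): kernel estimate g_n(x) = (1/(n h)) sum_i W((x - x_i)/h),
    with the standard normal kernel as one admissible choice of W."""
    n = len(samples)
    W = lambda z: math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return sum(W((x - xi) / h) for xi in samples) / (n * h)

lam_star = [0.8, 1.1, 1.3, 0.9, 1.6, 1.2, 1.0, 1.4]   # hypothetical sequence (5)
h = len(lam_star) ** -0.2                              # h(n) -> 0, n*h(n) -> infinity

# sanity check: the estimate is a density (integrates to ~1 over a wide grid)
grid_lo, grid_hi, steps = -5.0, 8.0, 2600
dx = (grid_hi - grid_lo) / steps
mass = sum(parzen_density(grid_lo + (i + 0.5) * dx, lam_star, h) * dx for i in range(steps))
print(round(mass, 3))
```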
The maximum likelihood estimator λ̂_k given by (3) is well known to be both consistent and
sufficient for estimating λ, and can easily be shown to have the conditional density function

(9)    f(λ̂_k | λ) = [kλ/λ̂_k]^{k+1} exp[−kλ/λ̂_k] / (λ k Γ(k)),

with mean and variance given by

(10)    E(λ̂_k | λ) = kλ/(k − 1)    and    Var(λ̂_k | λ) = (kλ)² / [(k − 1)²(k − 2)],

respectively. Since the maximum likelihood estimator is sufficient for estimating λ, the Bayes estimator
E(λ | t) as defined by (2) can be conveniently written as E(λ | λ̂_k), and the corresponding empirical
Bayes estimator becomes

(11)    E_n(λ | λ̂_k) = ∫ λ f(λ̂_k | λ) ĝ_n(λ) dλ / ∫ f(λ̂_k | λ) ĝ_n(λ) dλ,

where f(λ̂_k | λ) and ĝ_n(λ) are given by (9) and (7), respectively.
Since the actual range of λ remains unknown, the region of integration in (8) and (11) must be
determined empirically. This can be satisfactorily resolved by taking the region of integration to be the
observed range of the estimates upon which the prior density approximation is based. Thus, it is only
necessary to order the estimates λ*_1, λ*_2, ..., λ*_n successively to obtain the region of integration.
Alternatively, the positive half of the real line could be used.
Let us now consider the linear transformation of λ̂_k defined by

(12)  λ* = C^{1/2}[λ̂_k − E(λ̂_k)] + E(λ),

where

(13)  C = Var(λ)/Var(λ̂_k).

The mean and variance of λ* are easily verified to be equivalent to those of the random variable λ, i.e., E(λ*) = E(λ) and Var(λ*) = Var(λ). If the mean and variance of λ were known, then the transformation could be applied to each of the maximum likelihood estimates of sequence (4), forming sequence (5). Since the mean and variance of λ generally remain unknown, estimates of these quantities are required.
Using relationships of conditional probability, we have from (10) that

(14)  E(λ̂_k) = E[E(λ̂_k|λ)] = [k/(k−1)] E(λ)

and that

(15)  Var(λ̂_k) = Var[E(λ̂_k|λ)] + E[Var(λ̂_k|λ)]
             = [k/(k−1)]² Var(λ) + [k²/((k−1)²(k−2))] E(λ²).
From (14) the prior mean can be consistently estimated by

(16)  Ê(λ) = [(k−1)/k] λ̄_n,

where λ̄_n denotes the sample mean (1/n) Σ_{j=1}^{n} λ̂_{k,j}. If in (15) Var(λ̂_k) is replaced by the sample variance S²_n = (1/n) Σ_{j=1}^{n} (λ̂_{k,j} − λ̄_n)², E(λ²) is replaced by the relation E(λ²) = E²(λ) + Var(λ), and E(λ̂_k) is replaced by λ̄_n, then upon solving for Var(λ) we obtain

(17)  V̂ar(λ) = [(k−2)S²_n − λ̄²_n](k−1)/k²

as an estimate of Var(λ). Proper substitution of the above results into (12) yields

(18)  λ* = Ĉ^{1/2}[λ̂_k − λ̄_n] + [(k−1)/k] λ̄_n,

where
EMPIRICAL BAYES SCALE PARAMETER ESTIMATOR 391
(19)  Ĉ = [(k−1)/k²][(k−2) − λ̄²_n/S²_n].
Thus, the transformation defined by (12) is completely determined and the empirical Bayes estimator
given by (11) can be formed.
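Under the stated assumptions, the transformed sequence (5) follows mechanically from (16)–(19). A minimal sketch (Python; the input estimates in the example are hypothetical, and clamping the estimate of C at zero for small or tightly clustered samples is an added safeguard, not part of the derivation):

```python
def transform_estimates(lam_hats, k):
    """Apply the linear transformation (18)-(19) to a sequence of MLEs.
    k is the sample size per experiment; k > 2 is required for the
    conditional variance in (10) to exist, and the estimates are
    assumed not to be all identical."""
    n = len(lam_hats)
    mean = sum(lam_hats) / n                            # lambda-bar_n
    s2 = sum((x - mean) ** 2 for x in lam_hats) / n     # S_n^2
    # (19): estimate of C = Var(lambda) / Var(lambda-hat_k)
    c = (k - 1) / k ** 2 * ((k - 2) - mean ** 2 / s2)
    c = max(c, 0.0)          # guard against a negative variance estimate
    # (18): matches the estimated prior mean (16) and variance (17)
    return [c ** 0.5 * (x - mean) + (k - 1) / k * mean for x in lam_hats]
```

By construction the transformed values average to [(k − 1)/k] λ̄_n, the estimate (16) of the prior mean.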
MONTE CARLO SIMULATION
To ascertain the usefulness of the empirical Bayes estimator, E_n(λ|λ̂_k), Monte Carlo simulation was employed on a UNIVAC 1108 computer. The criterion of comparison chosen was mean-squared error, and the widely utilized maximum likelihood estimator was the measurement reference. Therefore, the ratio

(20)  R = (empirical Bayes mean-squared error) / (maximum likelihood mean-squared error)

was of interest.
In the simulation, a value of λ was randomly generated from a chosen prior density function selected from the Pearson family of distributions [7]. Then a random sample t₁, t₂, . . ., t_k of size k, corresponding to the realization λ, was obtained from (1). The maximum likelihood estimate λ̂_k was then computed from (3) and its squared deviation (λ − λ̂_k)² from the corresponding realization of λ was calculated. For the second experiment, a new value of λ was generated and the process repeated, obtaining λ̂_k and its squared deviation. For this experiment, E₂(λ|λ̂_k) and its squared deviation, [λ − E₂(λ|λ̂_k)]², from the corresponding realization of λ were also calculated. This was repeated 20 times, and each time E_n(λ|λ̂_k) was calculated using the present λ̂_k as well as all previous maximum likelihood estimates. Five hundred repetitions of this run of 20 experiments were then made, and the averages of the squared deviations of λ̂_k and E_n(λ|λ̂_k) were formed as estimates of E(λ − λ̂_k)² and E[λ − E_n(λ|λ̂_k)]², respectively. Then the ratio R was calculated utilizing these estimated mean-squared errors. All numerical integrations were performed by means of the 11-point Gauss quadrature formula, and the weighting function W(·) in (7) was taken to be

W(Y) = (1/2π)[(sin Y)/Y]²,

where

Y = (λ − λ*_j)/[2h(n)]  and  h(n) = n^{−1/5}.
This procedure was repeated for all types of Pearson prior distributions, and the ratio R was observed to be significantly influenced by the prior distribution only through the ratio of the conditional variance of the maximum likelihood estimator to the prior variance of λ. This value can be represented by

(21)  Z = k² E²(λ) / [(k−1)²(k−2) Var(λ)],

where E(λ), the prior mean of λ, has been substituted for λ. Since the only factors affecting the ratio R, apart from the number of experiences, are contained in (21), this quantity can be conveniently used to summarize and index a given situation.
It was generally observed in the simulation that the ratio R varied only slightly for a given value of n, provided the value of Z remained invariant from distribution to distribution. These results indicate the robustness of the smooth empirical Bayes estimator to the form of the prior distribution. Also, it was observed that as Z increases, the values of the ratio R decrease. This phenomenon is best understood by considering the summary quantity Z. If Var(λ̂_k|λ) is large compared to Var(λ), then the maximum likelihood estimate of λ will vary widely. The empirical Bayes estimator, however, is capable of detecting this variation and can use this information to obtain better estimates of λ. Conversely, if Var(λ̂_k|λ) is small compared to Var(λ), then the maximum likelihood estimator would be expected to do quite well. In this case there is a great deal of information within an experiment, and previous experiments contribute very little information about the parameter. The maximum likelihood estimator's performance, however, never surpasses that achieved by the empirical Bayes estimator.
Values of R are plotted, in Figure 1, as a function of n, the number of past experiences, for different values of Z ranging from 0.5 to 5.0. For ease of presentation, curves have been smoothed through the actual data points.
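The design of the simulation can be illustrated with a conjugate stand-in. For a gamma prior on λ the Bayes rule (2) has a closed form, E(λ|t) = (a + k)/(b + S) with S the sufficient statistic, so the ratio (20) can be estimated without the density-approximation step. This is a sketch of the experimental design only, not a reproduction of the paper's Pearson-prior study; the prior parameters and seed are hypothetical.

```python
import random

def mse_ratio(a, b, k, reps=20000, seed=1):
    """Estimate the ratio (20) for a Gamma(a, b) prior on lambda (rate
    parametrization).  Given lambda, the sufficient statistic S is
    Gamma(k, lambda), the MLE (3) is k / S, and the posterior mean is
    (a + k) / (b + S)."""
    rng = random.Random(seed)
    se_mle = se_bayes = 0.0
    for _ in range(reps):
        lam = rng.gammavariate(a, 1.0 / b)       # realization of lambda
        s = rng.gammavariate(k, 1.0 / lam)       # S given lambda
        se_mle += (k / s - lam) ** 2             # squared error of the MLE
        se_bayes += ((a + k) / (b + s) - lam) ** 2
    return se_bayes / se_mle
```

Because the posterior mean minimizes Bayes risk, the simulated ratio falls below one, in line with the behavior of R in Figure 1.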
[Figure 1: curves of the ratio R versus the number of experiences (2 to 20), one for each of Z = 0.5, 1.0, 2.5, and 5.0.]

FIGURE 1. Ratio of the average squared-error of E_n(λ|λ̂_k) to the average squared-error of λ̂_k for several values of Z.
Figure 2 illustrates a typical comparison between the improvements realized by using the linear transformation defined by (18) and by not incorporating this feature into the analysis. The dotted line represents the ratio R formed with an empirical Bayes estimator whose prior density approximation is directly based on the sequence of maximum likelihood estimates (4). The solid line represents the ratio R formed with the empirical Bayes estimator as defined by (11); here the prior density approximation is based on the transformed sequence given in (5). Both ratios illustrate the improvement over the maximum likelihood estimator achieved by the two empirical Bayes estimators. Note, however, that the solid line is significantly lower than the dotted line. This result was repeatedly reproduced in the simulation for all values of Z considered.
[Figure 2: two curves of the ratio R versus the number of experiences (2 to 20), one with and one without the transformation.]

FIGURE 2. Typical comparison of the ratio R with and without the linear transformation on λ̂_k: (———) with, (- - -) without.
REFERENCES
[1] Bennett, G. K. and H. F. Martz, "A Continuous Empirical Bayes Smoothing Technique," Biometrika 59, 2, 361-368 (1972).
[2] Maritz, J. S., Empirical Bayes Methods (Methuen and Co., Ltd., London, England, 1970).
[3] Parzen, E., "On Estimation of a Probability Density Function and Mode," Ann. Math. Statist. 33, 1065-1076 (1962).
[4] Raiffa, H. and R. Schlaifer, Applied Statistical Decision Theory (Harvard Graduate School of Business Administration, 1961).
[5] Soland, R. M., "Bayesian Analysis of the Weibull Process With Unknown Scale Parameter and Its Application to Acceptance Sampling," IEEE Trans. Reliability R-17, 84-90 (1968).
[6] Soland, R. M., "Bayesian Analysis of the Weibull Process With Unknown Scale and Shape Parameters," IEEE Trans. Reliability R-18, 181-184 (1969).
[7] Thomas, D. G., "Computer Methods for Generating Pseudo-Random Numbers from Pearson Distributions and Mixtures of Pearson and Uniform Distributions," Unpublished Master of Science Thesis, Virginia Polytechnic Institute and State University (1966).
OPTIMAL ALLOCATION OF UNRELIABLE COMPONENTS FOR
MAXIMIZING EXPECTED PROFIT OVER TIME
Claude G. Henin
Faculty of Management Sciences
University of Ottawa
Canada
ABSTRACT
In the present paper, we solve the following problem: Determine the optimum redundancy level to maximize the expected profit of a system bringing constant returns over a time period T; i.e., maximize the expression P ∫₀ᵀ R dt − C, where P is the return of the system per unit of time, R the reliability of this system, C its cost, and T the period for which the system is supposed to work.

We present theoretical results so as to permit the application of a branch and bound algorithm to solve the problem. We also define the notion of consistency, which distinguishes two cases and simplifies the algorithm for one of them.
1. INTRODUCTION
In [4] we described different methods for solving the following problem. A serial system made of n independent stages has a reliability of R and a cost of C for a mission of a certain duration. If the system functions throughout the whole mission, the resultant revenue is P dollars. The problem therefore was to maximize the expression PR − C, where R and C are increasing functions of the number of standby units at each stage.
In this paper, we will consider a similar but more practical problem: we will suppose that, when working, the system produces a certain revenue per unit of time. This seems more likely to happen in real-life problems. For example, let us consider the orbiting of a commercial satellite or the placement of a submarine cable, which are sources of continual revenue as long as they function properly. In both cases, the reliability of the system can be increased significantly through redundancy before the system begins operations; however, the system can only be repaired with difficulty once it fails.
The reliability of the system is equal to the product of the reliabilities, R_i, of each stage i. At each stage, we have m_i components; i.e., one basic unit and (m_i − 1) standbys. At each instant t,* the reliability R_i is an increasing function of m_i, the number of components.

The cost of a stage is m_i c_i, where c_i is the acquisition cost of one component of type i, and the system returns a net revenue of P per unit of time while functioning. The problem is to maximize the profit for a period T (where T can be infinite); i.e., to maximize the expression

(1)  P ∫₀ᵀ R(m₁, . . ., m_n; t) dt − Σ_{i=1}^{n} c_i m_i,

where R(m₁, . . ., m_n; t) = Π_{i=1}^{n} R_i(m_i, t).

*The reliability R_i(t) of stage i is the probability that stage i is still working by time t. We neglect the influence of switching devices, which could diminish the reliability of each stage i if m_i becomes too large.
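To make (1) concrete, suppose loaded standbys with exponential failure laws, so that each stage reliability is R_i(m_i, t) = 1 − (1 − e^{−λ_i t})^{m_i} (the form used in Section 2 with p_i(t) = 1 − e^{−λ_i t}). A minimal evaluation sketch (Python; the rates, costs, P, and T in the example are hypothetical):

```python
import math

def stage_rel(m, lam, t):
    """R_i(m, t) for m loaded components with exponential failure law."""
    return 1.0 - (1.0 - math.exp(-lam * t)) ** m

def profit(ms, lams, costs, P, T, steps=200):
    """Objective (1): P * integral_0^T R(m; t) dt - sum c_i * m_i, with
    product-form system reliability and a trapezoidal rule."""
    integral = 0.0
    for i in range(steps + 1):
        t = T * i / steps
        w = 0.5 if i in (0, steps) else 1.0
        r = 1.0
        for m, lam in zip(ms, lams):
            r *= stage_rel(m, lam, t)
        integral += w * r
    integral *= T / steps
    return P * integral - sum(c * m for c, m in zip(costs, ms))
```

For a single stage with m = 1 and λ = 1, the integral reduces to 1 − e^{−T}, which gives a quick check of the quadrature.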
396 C. G. HENIN
The functions R_i(m_i, t) are nonincreasing in t because the reliability of each stage cannot increase with time.

Another problem arises if we suppose that the returns are discounted at a rate r. In this case, the problem becomes

(2)  max P ∫₀ᵀ e^{−rt} R(m₁, . . ., m_n; t) dt − Σ_{i=1}^{n} c_i m_i.

But it is clear that (2) is identical to (1) with the factor e^{−rt} included in R; i.e., if we replace R in (1) by R′ = e^{−rt}R. For computational reasons, it is easier to replace each R_i (i = 1, . . ., n) by R′_i = e^{−rt/n} R_i, which has the same effect as replacing R by R′. With this last transformation the treatment of the two problems is identical, and we shall restrict ourselves to the analysis of the first one. We shall analyze the properties of function (1) and indicate how this problem can be solved with the help of the methods described in [4].
2. CONSISTENCY
In order to analyze the function (1), we must define an important property. Let m denote the
n vector m u . ■ ., m„, and let us consider a subset S of indices among all the possible m. This subset
S is said to be consistent with respect to the failure law of the component and the structure of the
system if the following property holds: if for some and for some m, m' tS, t «£ T,R (m, t ) > > R (m' , t,
then R(m, t' ) ^ R{m , t') for all t' ^ T. Consistency for a set S implies that the reliability orderings
among all the vectors belonging to S remain constant between zero and T.
By taking T sufficiently small and by taking an upperbound N on the number of components at each
stage i, it is always possible to find consistent reliabilities. Indeed, if we have n stages, we have at
most N" reliability functions. As this number is finite, it is always possible to find consistent reliabili
ties for [O, 7'], where 7' is smaller than the smallest positive intersection point of these N" functions.
It seems impossible to provide general necessary and sufficient conditions to insure consistency
on the whole set of reliabilities. However, if the reliability function at a given stage Ri(m, t) can be
written as Pi{m)gi(t), then the reliabilities are consistent.
On the other hand, it is easy to find nonconsistent reliability functions. Numerical tests show that loaded standbys with exponential failure laws do not show consistent reliabilities for general values of T. As another case, consider a two-stage system with components in a loaded mode and having a linear failure law; i.e., the reliability of the system is R = [1 − (λ₁t)^{m₁}][1 − (λ₂t)^{m₂}]. We shall show that for λ₁ ≥ λ₂ we can find two reliability curves which intersect each other. It is sufficient, for example, to take m = (m₁, m₂) = (4, 1) and m′ = (m′₁, m′₂) = (2, 2); T must be less than 1/λ₁. For t > λ₂/λ₁², R(m, t) is larger than R(m′, t), but for t < λ₂/λ₁² it is smaller. Therefore, for λ₂/λ₁² < T < 1/λ₁, the two reliability curves intersect each other and are inconsistent. For T smaller than λ₂/λ₁², they do not intersect and are consistent. For larger numbers of stages, it becomes very difficult (computationally) to see if the reliability curves intersect each other.
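Consistency of a particular pair of allocations can be checked numerically by sampling the sign of R(m, t) − R(m′, t) on a grid over (0, T]. The sketch below (Python) encodes the two-stage linear-failure-law example with the hypothetical values λ₁ = 1, λ₂ = 0.5, for which the two curves cross at t = 0.5:

```python
def consistent(rel_a, rel_b, T, steps=1000):
    """True if the reliability ordering of two allocations never
    reverses on (0, T]; rel_a and rel_b map t to system reliability."""
    signs = set()
    for i in range(1, steps + 1):
        t = T * i / steps
        d = rel_a(t) - rel_b(t)
        if abs(d) > 1e-12:          # ignore exact ties
            signs.add(d > 0)
    return len(signs) <= 1

# Two-stage linear failure law, lam1 >= lam2 (hypothetical values):
lam1, lam2 = 1.0, 0.5
R = lambda t: (1 - (lam1 * t) ** 4) * (1 - lam2 * t)            # m  = (4, 1)
Rp = lambda t: (1 - (lam1 * t) ** 2) * (1 - (lam2 * t) ** 2)    # m' = (2, 2)
```

With these rates the crossing point lies at t = 0.5 < 1/λ₁ = 1, so the pair is consistent on [0, 0.4] but not on [0, 0.9].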
Therefore, the problem presented in this paper is more difficult than the one solved in [4]. For example, generally there is no sense in creating an undominated sequence of allocations at a given time as in [5]. However, if we compute an undominated sequence at a given time, and if the terms (i.e., the vectors m of this optimal sequence) are consistent for all t between 0 and T for all possible sequences, then we have the following theorem:
ALLOCATION OF UNRELIABLE COMPONENTS 397
THEOREM I: If the set of vectors m of an undominated sequence at a given time, t, is consistent
(i.e., if the undominated sequence remains the same for all t) then the optimal solution to (1) corresponds
to a term of this undominated sequence.
PROOF: Suppose the contrary, i.e., that the optimal solution is given by a vector m not belonging to the optimal sequence and with a cost of C. Consider the two successive terms in the optimal sequence of costs C′ and C″ such that C′ < C ≤ C″. Then by definition of the optimal sequence and consistency, we have that the reliability R(t) of our solution is always smaller than the corresponding reliability R′(t) of the term costing C′ in the optimal sequence. Therefore, P ∫₀ᵀ R(t) dt − C is smaller than P ∫₀ᵀ R′(t) dt − C′, and we arrive at a contradiction.
If the optimal sequence varies from one point in time to another, it is not true that the optimal solution to (1) is always a term in any optimal sequence. It can be a term which never appears in any optimal sequence at any time. Therefore, the use of optimal sequences is interesting only when they are identical at any time between zero and T. By extension, we shall call such an optimal sequence consistent.* As shown by our former example, it is very difficult to establish the consistency of two reliability curves even in the case of a very simple failure law. A fortiori, it is very difficult if not impossible to establish such a property for a set, such as an optimal sequence, which cannot be formally defined mathematically. Therefore, the only way of checking the consistency of an undominated sequence is to compute optimal sequences for different times and see if they are identical. If they are for a reasonable number of trials, it can be assumed that the undominated sequence is consistent. Naturally, consistency on the optimal sequence is a far weaker restriction than consistency on the whole set of reliabilities.
3. PROPERTIES OF THE FUNCTION
If the standbys are in a loaded mode and if the unreliability of a component at stage i is p_i(t), formula (1) becomes

P ∫₀ᵀ Π_{i=1}^{n} {1 − [p_i(t)]^{m_i}} dt − Σ_{i=1}^{n} m_i c_i = P ∫₀ᵀ R(m, t) dt − C,

by taking C = Σ_{i=1}^{n} m_i c_i (see [4]).

For such a situation, we have the following properties. Proofs from our previous papers can be used to verify these properties.
PROPERTY I: If the standby units at each stage are in a loaded mode, if p_i(t) ≥ p_j(t) for all t between 0 and T, and if c_i ≤ c_j, then at the optimal solution the optimal number of components at stage i, m*_i, is larger than or equal to the optimal number of components at stage j, m*_j.

PROOF: See the proof of Theorem I in [3].

This property reduces the number of solutions we must consider when using a branch and bound technique.
*This property implies more than consistency on the set of indices of the optimal sequence and less than consistency for
all the possible indices.
PROPERTY II: If P increases, ∫₀ᵀ R dt and C are nondecreasing.

PROOF: As the proof of Theorem VI in [4].

COROLLARY: If the undominated sequence at any time is consistent, then R and C are nondecreasing functions of P.

This property and its corollary yield a lower bound on the cost, which can be used if a solution to the problem is known for a certain value of P smaller than the present one.
Consider variations in the duration of the mission T. Assume that R(t) and C are the reliability function and cost of the optimal solution for a duration T, and R′(t) and C′ the reliability function and cost of the optimal solution for a duration T′. For notational simplicity, take

Z(x) = ∫₀ˣ R(t) dt  and  Z′(x) = ∫₀ˣ R′(t) dt.

THEOREM II: If T′ > T, then Z(T′) − Z(T) ≤ Z′(T′) − Z′(T).

PROOF: By hypothesis, the following relationships hold:

PZ(T) − C ≥ PZ′(T) − C′,
PZ′(T′) − C′ ≥ PZ(T′) − C.

These two inequalities imply that

Z(T) − Z′(T) ≥ Z(T′) − Z′(T′), or Z′(T′) − Z′(T) ≥ Z(T′) − Z(T).

COROLLARY: If the undominated sequence (at a given time) is consistent, then T′ > T implies that R′ ≥ R and C′ ≥ C.
This theorem and its corollary show that as T increases, the corresponding optimal solution is either a more costly and more reliable one, in the case of consistency of the undominated sequence, or a solution such that ∫ R dt becomes greater between T and the new horizon if the undominated sequence is not consistent.
THEOREM III: If the undominated sequence at a given time is consistent and if P′ ≥ P, then at the optimal solution, m′* ≥ m*.

PROOF: Suppose that there are m*_i components at stage i with P and (m*_i − v) = m′*_i with P′ (v integer > 0). Let R^i and R′^i be the reliabilities on all stages except i (i.e., R/R_i and R′/R′_i) in the optimal solutions corresponding to P and P′, respectively. Similarly, let C^i and C′^i be the costs of these (n − 1) stages in the corresponding solutions.

By the corollary of Property II, R′(t) ≥ R(t) at the corresponding optimal solutions. Thus, R′^i > R^i. Moreover, by hypothesis,

(i)  P ∫₀ᵀ R_i R^i dt − C^i − m*_i c_i ≥ P ∫₀ᵀ R′_i R^i dt − C^i − (m*_i − v) c_i, or P ∫₀ᵀ (R_i − R′_i) R^i dt ≥ v c_i.

Similarly,

(ii)  P′ ∫₀ᵀ R′_i R′^i dt − C′^i − (m*_i − v) c_i ≥ P′ ∫₀ᵀ R_i R′^i dt − C′^i − m*_i c_i, or v c_i ≥ P′ ∫₀ᵀ (R_i − R′_i) R′^i dt.

Grouping (i) and (ii), we get

P ∫₀ᵀ (R_i − R′_i) R^i dt ≥ P′ ∫₀ᵀ (R_i − R′_i) R′^i dt,

and, as P < P′ and R_i(t) ≥ R′_i(t), we get R^i > R′^i. Hence, a contradiction.
This theorem allows us to determine a minimum (maximum) number of components at each stage if a solution is known for a smaller (larger) value of P and if the undominated sequences are consistent.

Generally, the reliabilities R_i(m_i) are piecewise concave in m_i. To assume such a property is not a drastic restriction. In the case of loaded standbys, it is always satisfied. In the case of unloaded standbys, this is not always true. However, if we consider, for example, components with exponential failure laws with mean 1/λ, the gain in reliability in passing from m units to (m + 1) units is, for a one-stage problem, (λt)^m e^{−λt}/m!. This gain is a decreasing function of m for λt < m. Generally, problems will remain in such a situation because, if at the optimal solution m is less than λt for a time t less than T, this implies that the reliability at the end is not very high. For m = 1, this implies a reliability smaller than e^{−1} at time T.*
If R_i(m_i) is piecewise concave for all i, so that our assumption holds, we have the following properties.

Suppose that, for each stage, a maximum attainable reliability has been computed, i.e., Z_i(t), the limit of R_i(m_i, t) as m_i goes to infinity; this limit always exists because R_i is increasing in m_i and bounded by 1. Let R_M^i (M being used to represent the maximum) be the corresponding maximum reliability on all the stages except i.

PROPERTY III: If the reliability at each stage is a concave function of the number of components at this stage, there exists an upper bound, M_i, on the number of components at each stage i, such that

M_i = min {k : P ∫₀ᵀ (R_i(k + 1, t) − R_i(k, t)) R_M^i dt ≤ c_i}.
PROOF: The proof is analogous to that of Theorem V in [4].

Similarly, if a minimum number of components at each stage has been computed, and R_L^i is the corresponding minimum reliability on all the stages except i, the following property holds:

PROPERTY IV: If the reliability at each stage is a concave function of the number of components at this stage, there exists a lower bound, L_i, on the number of components at this stage, such that

L_i = max {k : P ∫₀ᵀ (R_i(k, t) − R_i(k − 1, t)) R_L^i dt ≥ c_i}.

These properties are also valid in the discounted case. A study of the asymptotic behavior of the function (1) yields the following propositions.
*We will show later that the gain on the integral of R_i, i.e., ∫₀ᵀ (R_i(m_i, t) − R_i(m_i − 1, t)) dt, is a decreasing function of m_i, and therefore the following properties are also fully applicable to this situation.
THEOREM IV: If P → ∞, then at the optimal solution, m*_i tends to ∞ for all i.

PROOF: m*_i tends to ∞ because L_i does. In order to have L_i ≥ N (N arbitrarily large), it is sufficient to take

P ≥ c_i / ∫₀ᵀ (R_i(N, t) − R_i(N − 1, t)) R_L^i dt,

where R_L^i(t) can be taken as R^i(1, . . ., 1; t), to prove the theorem.
When T tends to infinity, results are less simple. However, it is possible to give some results depending upon the integrability of the reliability functions.

THEOREM V(a): If T = ∞, and if there exists an index j such that Z_j(t) is integrable on [0, ∞), then at the optimal solution to (1), m*_i remains finite for all i.

PROOF: We apply Lebesgue's theorem to the function f(m_j) = R_j(m_j, t) R^j, where I(·) denotes integration over [0, ∞). f(m_j) converges to f = Z_j(t) R^j and is bounded by Z_j(t) ∈ ℒ₁. Then I(f(m_j)) is finite and converges to I(f). Therefore, I(f(m_j)) − I(f(m_j − 1)) converges to zero as m_j goes to infinity. From Property III, there exists a maximum number of components M_i for stage i (i = 1, . . ., n).

In the discounted case, it is easy to see that the above theorem is always applicable: it is sufficient to replace R_i by R′_i = e^{−rt/n} R_i to see that Z_i(t) ≤ e^{−rt/n} is integrable on [0, ∞). Therefore, in the discounted case, when the horizon is infinite, the optimal solution to the problem remains finite.
THEOREM V(b): If none of the Z_j belongs to ℒ₁, but if

(3)  lim_{T→∞} ∫₀ᵀ (R_j(m_j, t) − R_j(m_j − 1, t)) R_L^j dt < c_j/P − ε

for some positive number ε, then m*_j remains finite when T goes to infinity.

PROOF: By contradiction: if we take m*_j ≥ M, with M large enough that

∫₀^∞ (R_j(M, t) − R_j(M − 1, t)) R_L^j dt < c_j/P,

we get M ≥ M_j and thus a contradiction.

For other situations, the asymptotic behavior of m*_j is unknown. The function R may admit at least two maxima, one at finite range and the other at infinity. A priori, it is impossible to determine which one of these maxima would be the global one. However, if the left-hand side of (3) is greater than c_j/P + ε, letting m_j go to infinity increases the value of (1) to infinity.
4. AN EXAMPLE: UNLOADED STANDBYS
A situation which may frequently arise is the case of unloaded standbys. In this situation the standby units have no probability of failure until they are put into the system. This may frequently happen when only one stage is used; i.e., a system consists of one component and spare parts which are introduced into the system when the main component fails.

Let q(t) be the unreliability of one of these components, Q_n(t) the unreliability of the system made of one basic unit and (n − 1) standbys, and let Q′_n(t) be the first derivative with respect to t of Q_n(t). The unreliability of the system is given by the following expression [2]:

Q_n(t) = ∫₀ᵗ q(t − τ) Q′_{n−1}(τ) dτ  for n > 1,
Q₁(t) = q(t).
Such a law can be computed or approximated in most cases. For an exponential failure law, this formula becomes

(4)  Q_n(t) = 1 − Σ_{j=0}^{n−1} [(λt)^j / j!] e^{−λt},

and (4) is easily integrable. The benefit of adding one unit to the system, i.e., the benefit of passing from n units to (n + 1) units, is

P ∫₀ᵀ [(λt)^n / n!] e^{−λt} dt − c (where c is the cost of a component of this one-stage system).

By solving the integral, this function becomes

P[1 − e^{−λT}(1 + λT + . . . + (λT)^n / n!)] / λ − c.

This function represents the profit of adding one unit to a one-stage system with one basic unit and (n − 1) spare parts. It is decreasing in n and negative for n going to infinity. The solution n* to our problem is therefore the smallest number n such that the above expression is negative.
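The stopping rule is easy to apply numerically: accumulate the Poisson partial sums and return the first n at which the marginal profit turns negative. A sketch (Python; the values of λ, T, P, and c in the example are hypothetical):

```python
import math

def optimal_spares(lam, T, P, c, nmax=1000):
    """Smallest n for which the marginal profit of passing from n to
    n + 1 units, P * [1 - e^{-lam*T} * sum_{j<=n} (lam*T)^j / j!] / lam - c,
    is negative (one-stage system, unloaded exponential standbys)."""
    cdf = 0.0
    term = math.exp(-lam * T)          # j = 0 Poisson term
    for n in range(nmax):
        cdf += term                    # now the sum over j = 0 .. n
        if P * (1.0 - cdf) / lam - c < 0:
            return n
        term *= lam * T / (n + 1)      # next Poisson term
    return nmax
```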
In the case of other failure laws, the integrals are not that easy to solve. If the unreliability of the unit, q(t), remains bounded between two linear functions of t, λ′t ≤ q(t) ≤ λ″t, the unreliability of the system is bounded between the unreliabilities of two components with exponential laws with parameters λ′ and λ″. Therefore, the above formulas applied to λ′ and λ″ give approximate solutions to the problem.
In the case of multistage problems, the above type of solution is not valid and the problem is more complex. However, for exponential failure laws, for example, all the terms in R, made of the products of expressions such as (4), can be integrated. If, for each stage, we compute a number n_i as above, this number is not the optimal solution but an upper bound on the number of components at each stage, because the profit of adding a component is not

P ∫₀ᵀ (R_i(m_i, t) − R_i(m_i − 1, t)) dt − c_i,

but

(5)  P ∫₀ᵀ (R_i(m_i, t) − R_i(m_i − 1, t)) R^i dt − c_i.
If a lower bound is known on the number of components at each stage, (5) can be used, by replacing R^i by R_L^i, to generate new lower bounds on the number of components at each stage.
5. COMPUTATIONAL PROCESSES
The former theorems enable us to parametrize the problem as soon as an optimal solution is found for a value of P and T. However, the main difficulty remains the determination of such an optimal solution. We propose the following algorithm (in the case of R_i or ∫₀ᵀ R_i dt concave in m_i):*
(a) Compute the limit, for m_i going to infinity, of R_i(m_i, t) (i = 1, . . ., n).

(b) Compute a maximum number of components at each stage (Property III). Stop the process when no further improvement is possible. Similarly, compute also the minimum number of components at each stage. If the minimum and maximum numbers of components coincide for each stage, stop: this is the solution. Otherwise, go to the next step.

(c) Select different times and compute the undominated sequence for each of these durations, the number of components at each stage remaining between the bounds determined at the former step. If these optimal sequences are identical, assume that the optimal sequence remains consistent (even if the system is not consistent, the solution will be very near the optimal one) and go to step (d). If the optimal sequences remain identical except for costs between a and b, assume that the optimal sequence is consistent for costs smaller than a and larger than b and go to step (f). If the optimal sequences do not satisfy one of the two categories above, go to step (e).**

(d) Compute the value of (1) for all the terms of the optimal sequence and terminate by taking the term giving the largest value of (1). This is the optimal solution (or a solution very close to it).

(e) In this case, apply the branch and bound technique described in [3] and [4]. There are no fundamental theoretical difficulties in its application to the present problem; however, at each step, the integral of the reliability of the current solution considered in the algorithm must be computed, which lengthens the computations. The initial solution and lower bound on the function is either (0, . . ., 0) or (M₁, . . ., M_n), whichever one gives the larger value of (1).

(f) Compute the value of (1) for all the terms of the optimal sequence for costs smaller than a or larger than b. Take the term H among them which gives the largest value of (1). Then apply the branch and bound algorithm, as in step (e), with the following restrictions: all the complete solutions must have a cost between a and b, and the initial solution and lower bound are H and the corresponding value of (1). The solution given by this algorithm is the optimal (or very close to the optimal) solution of the problem.
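Step (d) above is a one-pass evaluation of (1) over the terms of the undominated sequence. A sketch (Python; the candidate terms, the common stage law, and all numeric values are hypothetical placeholders for whatever steps (a)–(c) would produce):

```python
import math

def best_term(terms, rel, costs, P, T, steps=200):
    """Evaluate (1) for each allocation in `terms` and return the one
    giving the largest value.  rel(i, m, t) is the stage reliability
    R_i(m, t); the integral uses a trapezoidal rule."""
    def objective(m):
        integral = 0.0
        for j in range(steps + 1):
            t = T * j / steps
            w = 0.5 if j in (0, steps) else 1.0
            r = 1.0
            for i, mi in enumerate(m):
                r *= rel(i, mi, t)
            integral += w * r
        return P * integral * T / steps - sum(
            c * mi for c, mi in zip(costs, m))
    return max(terms, key=objective)

# Same loaded exponential law at every stage (hypothetical):
rel = lambda i, m, t: 1.0 - (1.0 - math.exp(-t)) ** m
```

With a very large P the most reliable term wins, while with P = 0 the cheapest does, in line with the corollary of Property II.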
NOTE: It does not seem possible to use an approximate solution as we did in [4], because the cancellation of the first derivatives of the objective function does not give a solvable system of equations as in [1]. In some cases, however, (1) is also a concave function of the m_i. From [4], this is certainly true in the region of m_i's such that

Σ_{i=1}^{n} (1 − R_i(m_i, T)) ≤ 1,
*As mentioned before, in the discounted case, R_i is replaced by e^{−rt/n} R_i.

**In the case where the undominated sequence would remain consistent except on some intervals [a_j, b_j], it would be possible to extend the method described in (f).
or everywhere if we have loaded standbys. If (M₁, . . ., M_n) satisfies such a condition, it seems possible to find a local maximum (which would probably be a global one too) by applying the same routine described in [4]. Now, if this solution did not satisfy the above condition, we would have to show that the function (1) is still concave at that point, or forget it and apply the algorithm described above.
REFERENCES
[1] Fan, L. T., C. S. Wang, F. A. Tillman, and Huang, "Optimization of System Reliability by the Discrete Maximum Principle," IEEE Transactions on Reliability R-16 (Sept. 1967).
[2] Gnedenko, B. V., Y. K. Belyayev, and A. D. Solovyev, Mathematical Methods of Reliability Theory (Academic Press, 1969).
[3] Henin, C. G., "An Algorithm for Maximizing Reliability Through System Redundancy," Carnegie-Mellon University, Management Sciences Report #216 (Aug. 1970).
[4] Henin, C. G., "Computational Techniques for Optimizing Systems with Standby Redundancy," Nav. Res. Log. Quart. 19, 293-308 (June 1972).
[5] Kettelle, J. D., "Least Cost Allocation of Reliability Investment," Operations Research 10, 249-265 (1962).
A CONTINUOUS SUBMARINE VERSUS SUBMARINE GAME
Eric Langford*
University of Maine
Orono, Maine
ABSTRACT
This paper analyzes, from a game-theoretic standpoint, the simultaneous choice of speeds by a transitor and by an SSK which patrols back and forth perpendicular to the transitor's course. Using idealized acoustic assumptions and a cookie-cutter detection model which ignores counterdetection, we are able to present the problem as a continuous game, and to determine an analytic solution. The results indicate that with these assumptions there are conditions under which neither a "go fast" nor a "go slow" strategy is optimal. The game provides a good example of a continuous game with a nontrivial solution which can be solved effectively.
INTRODUCTION
This paper analyzes from a gametheoretic standpoint the simultaneous choice of speeds by a
transitor and by an SSK which patrols back and forth perpendicular to the transitor's ( nurse. (An SSK
is a submarine whose mission is directed against enemy submarines.) The payoff, in effect , is taken to be
the SSK's detection sweep width. This game was originally investigated by D. H. Wagner and E. P.
Loane in classified reports during 1963–64. Their treatment was confined to choices of speeds from a
discrete set, but applied to rather general acoustic conditions. The present analysis assumes an idealized
form of propagation loss versus range and of noise versus speed. These idealizations permit the
sweep width to be expressed as a convenient continuous function of the two speeds; accordingly, they
allow each player to make choices from a continuum of speeds, rather than from a discrete set. They
also permit a comprehensive analysis of the variety of cases which can arise within the idealizations.
Subsequent to Wagner and Loane's work, an approach to this game was undertaken by Mathematica [6], treating a continuous analytic payoff function based on idealized acoustics. Unfortunately,
inconsistent acoustic assumptions were used in [6]; these were corrected by Mathematica in a subsequent report [1]. Motivated by [6] (and prior to [1]), the present analysis was undertaken, again using a
continuous payoff function, but using acoustic assumptions generally felt to be consistent.
In this paper, we assume that the graph of noise versus speed is linear above a breakpoint speed
below which noise is independent of speed; this is a common assumption and was made in [6]. We
also make the usual "spreading law" assumption, i.e., that propagation loss is proportional to the k'th
power of range when loss is expressed in power units. This of course is equivalent to the assumption
that propagation loss is proportional to the logarithm of the (k'th power of) range when loss is expressed in decibels. (The inconsistency in [6] appears to be on this point.) We take for a payoff function
what is essentially the SSK's kinematically enhanced sweep width; the SSK, of course, attempts to
*Research on this paper was performed when the author was with the Naval Postgraduate School and Daniel H. Wagner,
Associates.
405
maximize this quantity and the transitor attempts to minimize it. The SSK will thus be maximizing
his detection probability if we assume a cookie-cutter detection model, i.e., that detection occurs when
and only when the signal excess (assumed deterministic) reaches a threshold value. We ignore counterdetection by the transitor. This will be a realistic assumption if the SSK's acoustic capability is far
superior to that of the transitor. The more general problem, which takes into account counterdetection and which allows for a stochastic detection model, appears to be quite difficult to solve within this
framework. However, in a discretized form, it was handled satisfactorily by Wagner and Loane in the
above-mentioned reports.
The game is described in abstract terms in the first section, i.e., the payoff function is given in a
form which is equivalent (for purposes of the game's analysis) to the formula for the SSK's kinematically
enhanced sweep width. (The SSK's effective sweep width is increased by his back and forth patrol.) The
properties of the payoff function are developed into geometric criteria for solution. In the second section,
graphical methods of finding the solution are given and illustrated by examples. A numerical solution is
described in the third section. The fourth section enumerates the possible outcomes as combinations of
the pure and mixed strategies. Identification of the abstract game with the real SSK versus transitor
game is given in the final section.
In an earlier version, this paper was submitted as a Memorandum to Commander, Submarine
Development Group Two, in New London [4]. This earlier version includes graphs of all possible
cases and a Fortran computer program to solve the game. A subsequent classified memorandum applied
the analysis to some "real life" numbers.
DESCRIPTION OF THE GAME
We consider the following game. The maximizing player (SSK) chooses a speed u in the range
0 < u_min ≤ u ≤ u_max, and the minimizing player (transitor) chooses a speed v in the range
0 < v_min ≤ v ≤ v_max. The payoff function is

F(u, v) = e^{cv - u} √(1 + u^2/v^2),

where c > 0 is a constant.
The explicit identification of this payoff function with the SSK's detection sweep width will be
clarified later. For the time being, we treat the game abstractly.
Computing the partial derivative with respect to v, we see that

F_2(u, v) = F(u, v)[c - u^2/(v(u^2 + v^2))].
The second partial derivative with respect to v is

F_22(u, v) = F(u, v){[c - u^2/(v(u^2 + v^2))]^2 + u^2(u^2 + 3v^2)/(v^2(u^2 + v^2)^2)},

so that F_22 > 0. That is, F is convex in its second argument; it is well known that this implies the
following facts (see [2, p. 80] or [5, p. 259]):
1. The minimizing player (transitor) always has an optimal pure strategy.
2. The maximizing player (SSK) always has an optimal mixed strategy involving at most two speed
choices; i.e., he either has an optimal pure strategy, or an optimal mix of two speeds.
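The convexity of F in its second argument can also be spot-checked numerically. The following sketch (the function name F and the grid are our own choices, not the author's) verifies that a second central difference in v is positive over a grid of (u, v) pairs:

```python
import math

def F(u, v, c):
    """Payoff F(u, v) = exp(c*v - u) * sqrt(1 + (u/v)**2)."""
    return math.exp(c * v - u) * math.sqrt(1.0 + (u / v) ** 2)

# A positive second central difference in v at every grid point is
# consistent with F_22 > 0, i.e., with F convex in v.
c, eps = 1.0, 1e-4
for i in range(1, 11):
    for j in range(1, 11):
        u, v = 0.1 * i, 0.1 * j
        d2 = F(u, v + eps, c) - 2.0 * F(u, v, c) + F(u, v - eps, c)
        assert d2 > 0.0
```

The strict inequality F_22 > 0 makes the check robust at this step size: both terms in the bracketed expression for F_22 are nonnegative and cannot vanish simultaneously.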
Let us fix v and consider F as a function of the one variable u. Since F is continuous and restricted
to a closed bounded interval, it assumes a maximum value; this maximum must be taken on at an
endpoint (i.e., either u_min or u_max) or at an interior relative maximum, which can, in this case, be
located by differentiation. To this end, we form the partial derivative of F with respect to u, and equate
it to zero:

F_1(u, v) = F(u, v)[(u - u^2 - v^2)/(u^2 + v^2)] = 0.
Now F never vanishes, so that F_1(u, v) = 0 iff u = u^2 + v^2. The solution set to this equation in the u-v
plane is the semicircle

{(u, v) : (u - 1/2)^2 + v^2 = 1/4, and v > 0}.
From the form of F_1, it is evident that F_1(u, v) > 0 if (u, v) is inside the semicircle, and that
F_1(u, v) < 0 if (u, v) is outside the semicircle. Thus the graph of F(u, v) versus u for fixed v can have
three qualitatively different forms, as shown in Figure 1. The case v = 0.6 is typical of v > 1/2 and the
case v = 0.4 is typical of v < 1/2.
Figure 1. Qualitatively different forms of F(·, v): v = 0.6 (strictly decreasing), v = 0.5 (inflection point), v = 0.4 (interior relative maximum).
Let us define the function φ as follows:

φ(v) = max {F(u, v) : u_min ≤ u ≤ u_max}.
From the above observations, it is not difficult to infer the following:
(1) If v ≥ 1/2, then φ(v) = F(u_min, v).
(2) If v < 1/2, then

(1)   φ(v) = max {F(u_min, v), F(u_max, v), F(u_0, v)},

where u_0 = 1/2 + √(1/4 - v^2), if this falls within the interval u_min ≤ u ≤ u_max; otherwise we ignore u_0.
Since F is convex in its second argument, v, it follows that φ is continuous. Its minimum value will be
the value of the game; moreover, if φ assumes its minimum at v = v_0, then v_0 is an optimal pure strategy
for the minimizing player. By the convexity of F, φ is unimodal (hence the above v_0 is defined uniquely).
Since φ is unimodal, we can locate its minimum numerically with great efficiency using a binary search;
graphical methods are also possible.
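As an illustration (the function names are ours, not the author's Fortran program), φ can be evaluated through formula (1) and its minimum located by a golden-section search, which exploits unimodality in the same way as the binary search just mentioned. With the parameters of the first worked example (Figure 5), the minimizer is the published v_0 = 0.4:

```python
import math

def F(u, v, c):
    """Payoff F(u, v) = exp(c*v - u) * sqrt(1 + (u/v)**2)."""
    return math.exp(c * v - u) * math.sqrt(1.0 + (u / v) ** 2)

def phi(v, c, u_min, u_max):
    """Formula (1): the maximum over u is at an endpoint or, for v < 1/2,
    possibly at the interior critical point u0 = 1/2 + sqrt(1/4 - v**2)."""
    candidates = [u_min, u_max]
    if v < 0.5:
        u0 = 0.5 + math.sqrt(0.25 - v * v)
        if u_min <= u0 <= u_max:
            candidates.append(u0)
    return max(F(u, v, c) for u in candidates)

def minimize_phi(c, u_min, u_max, v_min, v_max, tol=1e-10):
    """Golden-section search for the minimum of the unimodal phi."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = v_min, v_max
    while b - a > tol:
        x1, x2 = b - g * (b - a), a + g * (b - a)
        if phi(x1, c, u_min, u_max) < phi(x2, c, u_min, u_max):
            b = x2
        else:
            a = x1
    v0 = 0.5 * (a + b)
    return v0, phi(v0, c, u_min, u_max)

v0, value = minimize_phi(2.0, 0.50, 0.95, 0.05, 0.70)   # Example 1's data
```

With these inputs the search returns v_0 ≈ 0.4000 and game value √5 ≈ 2.236, in agreement with the pure-strategy solution reported in Figure 5.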
Similarly, we can fix u and solve the equation F_2(u, v) = 0:

F_2(u, v) = F(u, v)[c - u^2/(v(u^2 + v^2))] = 0.

The solution set to this equation is

{(u, v) : cv^3 + cu^2 v = u^2, (u, v) ≠ (0, 0)}.
By the Cardan-Tartaglia formula, this defines v as a function of u as follows:

v = [u^2/(2c) + √(u^4/(4c^2) + u^6/27)]^(1/3) + [u^2/(2c) - √(u^4/(4c^2) + u^6/27)]^(1/3).

Conversely, if v < 1/c, we can solve for u as a function of v:

u = √(cv^3/(1 - cv)).
Figure 2 shows the graphs of F_1(u, v) = 0 and F_2(u, v) = 0 for several values of c.
The intersection of the graphs of F_1(u, v) = 0 and F_2(u, v) = 0 will give a saddle point if u > 1/2,
i.e., if c > 1. If there is a pure strategy minimax solution to the game which is interior for both players
(i.e., neither u_min nor u_max for the SSK and neither v_min nor v_max for the transitor), then the solution
must occur at this saddle point. (Edge minimaxes need not be of this form.) To find this saddle point,
we solve the following equations simultaneously:

F_1(u, v) = F_2(u, v) = 0.
Figure 2. Graphs of F_1(u, v) = 0 and of F_2(u, v) = 0 for c = 0.5, 0.75, 1.0, 1.5, and 2.0.
Strangely enough, the answer is exceedingly simple:

u = c^2/(c^2 + 1),  v = c/(c^2 + 1).

Note that there can be no interior minimax if c ≤ 1.
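The closed-form point is easy to check against both first-order conditions. Dividing out the positive factor F(u, v), a minimal verification (function names ours):

```python
def F1_over_F(u, v):
    """F_1/F = (u - u**2 - v**2) / (u**2 + v**2): zero on the semicircle."""
    return (u - u * u - v * v) / (u * u + v * v)

def F2_over_F(u, v, c):
    """F_2/F = c - u**2 / (v * (u**2 + v**2)): zero on the cubic locus."""
    return c - u * u / (v * (u * u + v * v))

def saddle(c):
    """Closed-form interior saddle point; meaningful only when c > 1."""
    return c * c / (c * c + 1.0), c / (c * c + 1.0)

u, v = saddle(2.0)                       # u = 0.8, v = 0.4
assert abs(F1_over_F(u, v)) < 1e-12      # on the semicircle u = u**2 + v**2
assert abs(F2_over_F(u, v, 2.0)) < 1e-12
```

For c = 2 the saddle lands at (0.8, 0.4), which is exactly the pure-strategy pair reported for Example 1 (Figure 5).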
GRAPHICAL SOLUTION OF THE GAME
Define the function f as follows: For any u_0, u_min ≤ u_0 ≤ u_max, we define v_0 = f(u_0) iff F(u_0, v_0) = min
{F(u_0, v) : v_min ≤ v ≤ v_max}. That is, for any admissible u_0, f(u_0) is the v which minimizes F. By the
v-convexity of F, f is a well-defined, single-valued function. Moreover, f is continuous and monotone
increasing. In fact, if we stay away from v_min and v_max, f is strictly increasing; more precisely, if u_1 < u_2
are such that v_min < f(u_1) ≤ f(u_2) < v_max, then f is strictly increasing on the interval [u_1, u_2]. See
Figures 5, 6, and 7 for examples.
Actually, we can give a simple formula for f. Let h(u) be the solution to F_2(u, h(u)) = 0, i.e.,

h(u) = [u^2/(2c) + √(u^4/(4c^2) + u^6/27)]^(1/3) + [u^2/(2c) - √(u^4/(4c^2) + u^6/27)]^(1/3).
For unrestricted v, F(u_0, v) is convex and has an absolute minimum at v = h(u_0). Therefore if v_min
≤ h(u_0) ≤ v_max, then f(u_0) = h(u_0). If h(u_0) ≥ v_max, then f(u_0) = v_max, and if h(u_0) ≤ v_min, then
f(u_0) = v_min. We can summarize these cases as follows:

f(u_0) = min {v_max, max [v_min, h(u_0)]}.
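A sketch of h and of the clamped best reply f (names ours; the real cube root is taken explicitly, since Python's `**` would return a complex value for a negative base):

```python
import math

def cbrt(x):
    """Real cube root, valid for negative x as well."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def h(u, c):
    """Cardan-Tartaglia root of c*v**3 + c*u**2*v = u**2, solved for v."""
    s = math.sqrt(u ** 4 / (4.0 * c * c) + u ** 6 / 27.0)
    return cbrt(u * u / (2.0 * c) + s) + cbrt(u * u / (2.0 * c) - s)

def f(u0, c, v_min, v_max):
    """f(u0) = min{v_max, max[v_min, h(u0)]}: the transitor's best reply."""
    return min(v_max, max(v_min, h(u0, c)))

# h(0.8) for c = 2 should be 0.4 (the saddle of Example 1), and the root
# must satisfy the defining cubic c*v**3 + c*u**2*v = u**2.
v = h(0.8, 2.0)
assert abs(v - 0.4) < 1e-9
assert abs(2.0 * v ** 3 + 2.0 * 0.64 * v - 0.64) < 1e-9
```

When h(u_0) falls outside the admissible interval, f simply returns the nearer endpoint, exactly as in the three graphical cases below.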
It is straightforward to determine f(u_0) graphically. Let u_0 be fixed and consider the line segment

S = {(u, v) : u = u_0 and v_min ≤ v ≤ v_max}.

The three possibilities, namely f(u_0) = v_min, f(u_0) = v_max, and f(u_0) = h(u_0), can be exhibited graphically as follows:
(1) If the line S lies completely above the graph of h(u) versus u, then f(u_0) = v_min.
(2) If the line S lies completely below the graph of h(u) versus u, then f(u_0) = v_max.
(3) If the line S intersects the graph of h(u) versus u at (u_0, v_0), then f(u_0) = v_0 = h(u_0).
These three possibilities are graphed in Figure 3.
Figure 3. Three possibilities in the determination of f(u_0); c = 1 in this example.
In a similar fashion we define the "function" g as follows: For any v_0, v_min ≤ v_0 ≤ v_max, we define
u_0 = g(v_0) iff F(u_0, v_0) = max {F(u, v_0) : u_min ≤ u ≤ u_max} = φ(v_0). From Figure 1, it is clear that this
need not, in general, define a single-valued function. We shall subsequently show that either g is a
continuous single-valued function or the graph of g consists of two continuous pieces which overlap only at an endpoint. More precisely, in this second circumstance, there exists a point v_0 such that
v_min ≤ v_0 ≤ v_max, and such that g restricted to either of the subintervals [v_min, v_0) or (v_0, v_max] is a
continuous single-valued function; however, at v = v_0, g(v) is bivalent; it has the two values g(v_0 - 0)
and g(v_0 + 0). (It will also be shown later that this right-hand limit can only be u_min and that the left-hand limit can be either u_max or u_0 = 1/2 + √(1/4 - v_0^2).) In either case, g is monotone decreasing:
g(v_1) ≥ g(v_2) whenever v_1 ≤ v_2. Figure 5 illustrates the first possibility, while Figures 6 and 7 illustrate
the second.
Note that when we fix v and regard F(u, v) as a function only of u, the constant c enters the expression for F(u, v) only as a constant multiplier. Since we are interested in locating the maximizing
u for fixed v, it follows that the value of c is unimportant in the discussion which follows. That is, the
maximizing u is located in the same place no matter what value c has.
Figure 4. Six possibilities in the determination of g(v).
We refer to Figure 4: the semicircle ABC is the locus of F_1(u, v) = 0. The points on the quarter-circle AB are relative minima for F(u, v) considered as a function of u for fixed v, and the points on
the quarter-circle BC are relative maxima. The point B itself is an inflection point (cf. Figure 1). The
curve DB is defined as follows: if (u_1, v_0) lies on DB, then F(u_1, v_0) = F(u_2, v_0), where (u_2, v_0) lies
on the quarter-circle BC. The curve DB is thus obtained as the solution set of the following transcendental equation

F(u, v) = F(1/2 + √(1/4 - v^2), v),

where 0 < u ≤ 1/2.
Let v_0 be fixed and let S' denote the line segment

{(u, v) : u_min ≤ u ≤ u_max and v = v_0}.

There are six possibilities:
(1) The line S' lies completely within ABC; then g(v_0) = u_max. The line S' may meet, but not cross, ABC.
(2) The line S' lies completely outside of ABC; then g(v_0) = u_min. The line S' may meet, but not cross, ABC.
(3) The line S' crosses AB, but does not meet or cross DB. In this case, g(v_0) may be u_min, u_max,
or both, depending on the relative sizes of F(u_min, v_0) and F(u_max, v_0).
(4) The line S' crosses DB; then g(v_0) = u_min.
(5) The line S' meets or crosses BC, but does not meet DB. In this case, the maximizing u_0 occurs
at the intersection of S' and BC. Evidently g(v_0) = u_0 = 1/2 + √(1/4 - v_0^2).
(6) The line S' meets, but does not cross, DB. In this case, g(v_0) = u_min, unless S' also meets or
crosses BC. In this latter circumstance, g(v_0) has the two values u_min and u_0, where u_0 is at the intersection of S' and BC as in the case above.
These possible cases are all graphed in Figure 4.
If we graph g(v) versus v, the graph will be a nice continuous curve as long as g(v) "stays put",
i.e., as long as g(v) is one or the other of the endpoints or is u_0. Problems arise at the two "transition
stages":
A. As in Case 3, when F(u_min, v_0) = F(u_max, v_0) = φ(v_0). This will be called a type A transition.
B. As in Case 6, when F(u_min, v_0) = F(u_0, v_0) = φ(v_0). This will be called a type B transition.
As v increases through a transition stage v_0, g(v) will make an abrupt jump. Precisely at the transition
stage v = v_0, g(v) will be two-valued. It will now be shown that as v increases through the transition
stage v_0, one of the following two things will happen:
A. If v_0 is a type A transition, then g(v_0) must jump from u_max to u_min.
B. If v_0 is a type B transition, then g(v_0) must jump from u_0 to u_min.
It is a consequence of the above that g can have at most one transition stage, since any transition
puts g(v) in the "absorbing state" u_min. An example of a type A transition is found in Figure 7, and
an example of a type B transition is found in Figure 6.
To prove the above assertion, let us suppose that u_1 < u_2 and that for some v_0, F(u_1, v_0) = F(u_2, v_0).
If we define the function G(v) = F(u_1, v) - F(u_2, v), then by assumption, G(v_0) = 0. By computing
the (continuous) derivative G'(v), it is easy to show that G'(v_0) > 0, so that G is increasing in a neighborhood of v_0; in other words, as v increases through v_0, F(u_1, v) is first less than F(u_2, v) and then greater,
so that a transition can occur from a larger value of u to a smaller value, never the other way.
The "function" g is monotone decreasing since it decreases at its transition point (if there is one)
and since it obviously must decrease at points other than transition points. As is the case with f, if
v_1 < v_2 are such that u_min < g(v_2) ≤ g(v_1) < u_max, then g is strictly decreasing on the interval [v_1, v_2].
If we plot both f and g within the rectangle of admissible speeds defined by

{(u, v) : u_min ≤ u ≤ u_max and v_min ≤ v ≤ v_max},
then one of two things will occur:
(1) The two graphs will intersect at a single point (u_0, v_0). In this case, u_0 is an optimal pure
strategy for the maximizing player (SSK) and v_0 is an optimal pure strategy for the minimizing player
(transitor). See Figure 5 for an example of this case.
(2) The two graphs will not intersect. More precisely, there will exist a transition point v_0 such
that g(v_0) has the two values u_1 and u_2, where u_1 < u_2 and where u_1 < f^{-1}(v_0) < u_2. See Figures 6
and 7 for examples of this. Note that f^{-1} is defined at v_0 since f is strictly increasing as it passes through
Figure 5. Graphical solution of the game (Example 1). Transitor's minimum speed is 0.05 and maximum speed is 0.70; SSK's minimum speed is 0.50 and maximum speed is 0.95; c = 2.0. Optimal pure strategy for transitor is 0.4000; optimal pure strategy for SSK is 0.8000.
Figure 6. Graphical solution of the game (Example 2). Transitor's minimum speed is 0.25 and maximum speed is 0.60; SSK's minimum speed is 0.15 and maximum speed is 0.90; c = 1.0. Optimal pure strategy for transitor is 0.4591; optimal mixed strategy for SSK: 0.1500 with fraction 0.3972, 0.6981 with fraction 0.6028.
the hole in the graph of g. In this case, the minimizing player again has the optimal pure strategy v_0,
but the maximizing player now has an optimal mixed strategy given by u_1 with probability p and u_2
with probability 1 - p. The constant p is found by solving the following equation:

p·F_2(u_1, v_0) + (1 - p)·F_2(u_2, v_0) = 0.
Figure 7. Graphical solution of the game (Example 3). Transitor's minimum speed is 0.15 and maximum speed is 0.40; SSK's minimum speed is 0.00 and maximum speed is 0.20; c = 1.0. Optimal pure strategy for transitor is 0.2852; optimal mixed strategy for SSK: 0.0000 with fraction 0.1350, 0.2000 with fraction 0.8650.
No other possibilities (e.g., two intersections) can arise, by the aforementioned monotonicity properties
of f and g.
Three numerical examples of this graphical solution are given to make this clearer. In each
of these figures, the graph of f is indicated by a dashed line and the graph of g by a dot-dash
line. The box is the rectangle of admissible speeds defined above.
NUMERICAL SOLUTION OF THE GAME
The following is a step-by-step procedure for solving the game. It essentially repeats the graphical procedure, but from a point of view which emphasizes suitability for computation. This procedure
has been programmed for the GE-235 computer; a copy of the program and sample output are included
in [4]. The following step-by-step procedure can be considered a macroscopic flow chart of the computer program.
(1) Locate the minimum of φ(v) = max_u F(u, v). Since φ is unimodal, this can be done easily by
iterative computation using a binary search. The function φ(v) itself is evaluated by using formula
(1) derived earlier. The minimum value of φ(v) is the value of the game.
(2) Let v_0 be that unique number such that φ(v_0) = min_v φ(v). This is obtained automatically as
a byproduct of step 1. Then v_0 is the optimal pure strategy for the minimizing player (transitor).
(3) Solve the equation F(u, v_0) = φ(v_0) for u by checking the three possible places where the maximum could occur (viz., u_min, u_max, and u_0 = 1/2 + √(1/4 - v_0^2)). If there is a unique solution u*, then u* is
an optimal pure strategy for the maximizing player (SSK). If there are two distinct solutions u_1 < u_2,
then a mix of u_1 with probability p and u_2 with probability 1 - p will be optimal. The number p is the
solution to the equation

p·F_2(u_1, v_0) + (1 - p)·F_2(u_2, v_0) = 0.
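The three steps can be sketched in Python (this is our illustration, not the GE-235 program of [4]; a golden-section search plays the role of the binary search in step 1). On the data of Example 2 (Figure 6) it reproduces the published mixed strategy:

```python
import math

def F(u, v, c):
    return math.exp(c * v - u) * math.sqrt(1.0 + (u / v) ** 2)

def F2(u, v, c):
    """Partial derivative of F with respect to v."""
    return F(u, v, c) * (c - u * u / (v * (u * u + v * v)))

def solve_game(c, u_min, u_max, v_min, v_max, tol=1e-12):
    def candidates(v):
        cs = [u_min, u_max]
        if v < 0.5:
            u0 = 0.5 + math.sqrt(0.25 - v * v)
            if u_min <= u0 <= u_max:
                cs.append(u0)
        return cs

    def phi(v):                       # formula (1)
        return max(F(u, v, c) for u in candidates(v))

    # Steps 1 and 2: minimize the unimodal phi to get the value and v0.
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = v_min, v_max
    while b - a > tol:
        x1, x2 = b - g * (b - a), a + g * (b - a)
        if phi(x1) < phi(x2):
            b = x2
        else:
            a = x1
    v0 = 0.5 * (a + b)
    value = phi(v0)

    # Step 3: every u at which the maximum is attained; one -> pure
    # strategy, two -> mix with p*F2(u1,v0) + (1-p)*F2(u2,v0) = 0.
    maximizers = [u for u in candidates(v0) if abs(F(u, v0, c) - value) < 1e-6]
    if len(maximizers) == 1:
        return v0, value, [(maximizers[0], 1.0)]
    u1, u2 = min(maximizers), max(maximizers)
    p = F2(u2, v0, c) / (F2(u2, v0, c) - F2(u1, v0, c))
    return v0, value, [(u1, p), (u2, 1.0 - p)]

v0, value, mix = solve_game(1.0, 0.15, 0.90, 0.25, 0.60)   # Example 2's data
```

On this input the computed transitor strategy, mixing speeds, and probabilities agree with Figure 6 (0.4591; 0.1500 with 0.3972 and 0.6981 with 0.6028) to the printed precision.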
ENUMERATION OF POSSIBLE OUTCOMES
From an examination of Figure 1 or otherwise, we see that there are five qualitatively distinct
possibilities for the maximizing player:
U(1). A pure strategy of u_min.
U(2). A pure strategy of u_max.
U(3). A pure strategy of u_0, where u_min < u_0 < u_max.
U(4). A mixed strategy of u_min and u_max.
U(5). A mixed strategy of u_min and u_0, where u_min < u_0 < u_max.
Since the minimizing player always has a pure strategy, there are only three possibilities for him:
V(1). A pure strategy of v_min.
V(2). A pure strategy of v_max.
V(3). A pure strategy of v_0, where v_min < v_0 < v_max.
Apparently there are 15 cases to consider. However, it is known [5, p. 267] that in cases V(1)
and V(2), the maximizing player must also have a pure strategy. (This can be inferred also from the
geometric reasoning previously used.) Thus there are only 11 possible cases. In [4], examples are
given of each of these 11 cases. We remark that for certain values of c, some cases are forbidden. For
example, if c ≤ 1, then the case U(3)-V(3) cannot occur.
IDENTIFICATION OF ABSTRACT GAME WITH REAL GAME
We shall now identify the foregoing abstract game with a real SSK versus transitor game.
The SSK moves back and forth at a constant speed u' across a rectangular barrier zone. The transitor
enters the zone on a course perpendicular to the SSK's course at a constant speed v'.
The payoff for the SSK is detection sweep width. He attempts to maximize this quantity and the
transitor attempts to minimize it. Let L_S(v') denote the radiated noise (in decibels, measured at 1 yard
from the source) of the transitor as a function of its speed v', and let L_N(u') denote the self noise (in
decibels) of the SSK as a function of its speed u'.
We assume that the noise curves of both SSK and transitor are of the form shown in Figures 8
and 9, respectively.
Figure 8. Noise curve for SSK: L_N(u') versus SSK's speed u'.
Figure 9. Noise curve for transitor: radiated noise L_S(v') (db) versus transitor's speed v'.
It is obvious that the SSK will never travel more slowly than u'_min and that the transitor will never
travel more slowly than v'_min. We therefore assume that L_N(u') and L_S(v') are both linear for all u'
and v', but in the analysis we will not consider any speeds less than these minimum speeds. The
maximum speeds, u'_max and v'_max, are of course determined by the physical characteristics of the
respective submarines. We have then the following formulas for L_N(u') and L_S(v'):

L_N(u') = L_N^min + b(u' - u'_min),
L_S(v') = L_S^min + a(v' - v'_min),

where a and b are the slopes, measured in decibels/knot.
Assume that propagation loss obeys a spreading law, so that the decibel loss in propagating
from 1 to R yards is k log_10 R, for some fixed k > 0.
The unenhanced sweep width W for the SSK is given by

k log_10 (W/2) = L_S(v') - L_N(u') + N_DI - N_RD,

where N_DI and N_RD are the SSK's directivity index and recognition differential. (Of course all terms
in this equation are taken at the frequency and bandwidth of interest.)
If we solve this equation for W, we obtain (log 10 here denotes the natural logarithm of 10)

W = 2 exp {(1/k)(log 10)[L_S(v') - L_N(u') + N_DI - N_RD]}
  = 2 exp {(1/k)(log 10)[L_S^min + av' - av'_min - L_N^min - bu' + bu'_min + N_DI - N_RD]}.
If we make the following substitutions:

u = (b/k)(log 10) u',
v = (b/k)(log 10) v',
K = 2 exp {(1/k)(log 10)[L_S^min - L_N^min - av'_min + bu'_min + N_DI - N_RD]},
c = a/b,

then W is of the form

W = K e^{cv - u},

where K and c are positive constants.
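The chain of substitutions can be checked numerically. The slopes, noise levels, and speeds below are invented purely for illustration (they do not come from the paper); the sketch confirms that W computed directly from the sonar equation agrees with K e^{cv-u}:

```python
import math

# Illustrative numbers only (not from the paper): noise slopes a, b in
# db/knot, breakpoint noise levels in db, spreading-law constant k.
a, b, k = 1.5, 1.0, 20.0
Ls_min, Ln_min = 130.0, 110.0
v_min_knots, u_min_knots = 4.0, 5.0
N_DI, N_RD = 15.0, 10.0

def Ls(vp):
    """Transitor radiated noise (db at 1 yard) at true speed vp knots."""
    return Ls_min + a * (vp - v_min_knots)

def Ln(up):
    """SSK self noise (db) at true speed up knots."""
    return Ln_min + b * (up - u_min_knots)

def W_direct(up, vp):
    """Unenhanced sweep width from k*log10(W/2) = Ls - Ln + N_DI - N_RD."""
    return 2.0 * 10.0 ** ((Ls(vp) - Ln(up) + N_DI - N_RD) / k)

# Normalized speeds and constants: u = (b/k)(ln 10)u', v = (b/k)(ln 10)v',
# c = a/b, with K collecting all speed-independent terms.
scale = b * math.log(10.0) / k
c = a / b
K = 2.0 * math.exp((math.log(10.0) / k)
                   * (Ls_min - Ln_min - a * v_min_knots + b * u_min_knots
                      + N_DI - N_RD))

up, vp = 10.0, 8.0                   # true speeds in knots
u, v = scale * up, scale * vp        # normalized speeds
assert abs(W_direct(up, vp) - K * math.exp(c * v - u)) < 1e-9
```

Note that scaling u' and v' by the same constant b(log 10)/k preserves the ratio u/v = u'/v', which is what lets the kinematic enhancement factor of the next paragraph carry over unchanged.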
As noted by Wagner and Loane and by Mathematica, the kinematic enhancement of sweep width
in the back and forth patrol is a multiplicative increase in the approximate amount √(1 + (u'/v')^2),
providing the SSK's patrol legs are substantially longer than the acoustic sweep width W. (See [3,
Equation (7.1.4)].) Since the kinematic enhancement factor depends only on the ratio u'/v' = u/v,
the results of the previous analysis apply. Here the primed variables, namely u' and v', refer to true
ship speeds (in knots). The unprimed variables, namely u and v, refer to the "normalized speeds" as
considered in the solution to the game. All graphs, etc., are referred to these normalized speeds.
ACKNOWLEDGMENTS
I would like to thank Dr. Daniel H. Wagner for his help in the preparation of this paper. The work
was originally supported by ONR Contract Nonr-4784(00). The writing of this paper was supported
by an ONR Foundation Grant (FY 1968).
REFERENCES
[1] Agin, Norman I., et al., "The Application of Game Theory to ASW Detection Problems," Mathematica Report, Princeton, New Jersey (Sept. 30, 1967).
[2] Karlin, S., Mathematical Methods and Theory in Games, Programming, and Economics (Addison-Wesley, Reading, Mass., 1959).
[3] Koopman, B. O., "Search and Screening," OEG Report No. 56, Operations Evaluation Group, Office of the Chief of Naval Operations, Navy Department (1946).
[4] Langford, E. S., "Game-Theoretic Analysis of Choice of Speeds by SSK and Transitor," Daniel H. Wagner, Associates Memorandum to CSDG-2 (Nov. 17, 1966).
[5] McKinsey, J. C. C., Introduction to the Theory of Games (McGraw-Hill, New York, 1952).
[6] "A Study of Optimal Patrol and Transit Strategies in a Rectangular Barrier Zone Using Mathematical Games," Mathematica Report, Princeton, New Jersey.
TOTAL OPTIMALITY OF INCREMENTALLY OPTIMAL ALLOCATIONS*
Lawrence D. Stone
Daniel H. Wagner, Associates
Paoli, Pennsylvania
ABSTRACT
This paper considers the problem of finding optimal solutions to a class of separable
constrained extremal problems involving nonlinear functionals. The results are proved for
rather general situations, but they may be easily stated for the case of search for a stationary
object whose a priori location distribution is given by a density function on R, a subset of
Euclidean n-space. The functional to be optimized in this case is the probability of detection
and the constraint is on the amount of effort to be used.
Suppose that a search of the above type is conducted in such a manner as to produce the
maximum increase in probability of detection for each increment of effort added to the
search. Then under very weak assumptions, it is proven that this search will produce an optimal allocation of the total effort involved. Under some additional assumptions, it is shown
that any amount of search effort may be allocated in an optimal fashion.
1. INTRODUCTION
In this paper we consider the relationship between incrementally optimal allocations and totally
optimal allocations. Motivation for studying this relationship arises naturally in planning a search for
a lost object. Suppose that the search planner is given authorization to search for a fixed time interval,
and he conducts the search to produce the maximum probability of detection at the end of the interval.
If the search fails to detect the lost object within the allotted time, the planner may be given authorization to continue searching for an additional time increment. In this case the planner may allocate the
additional search effort to maximize the probability of detection in the given increment. Having done
this, one may ask whether the search could have produced a higher detection probability if it were
known in advance that both the initial time interval and the added increment were available.
In mathematical terms the search problem is to allocate optimally a given amount of effort in order
to detect a stationary object, the target, located in Euclidean n-space, R. There is a function f which
gives the probability density of the target's location. Suppose T is the amount of effort available for the
search. Then the search planner seeks a function q* : R → [0, ∞) such that ∫_R q*(x) dx ≤ T and

(1.1)   ∫_R b(x, q*(x)) f(x) dx = max {∫_R b(x, q(x)) f(x) dx : q ≥ 0, ∫_R q(x) dx ≤ T}.
The function b(x, ·) is the local effectiveness function at x. That is, b(x, y) gives the conditional probability of detecting the target given it is located at x and the effort density is y at x. The integral on the
left of (1.1) gives the probability of detecting the target when using allocation q*. The function q* is
called an optimal allocation. This problem has an obvious analog when R is replaced by a countable set
of locations or boxes.
*This research was supported by the Naval Analysis Programs, Office of Naval Research, under Contract No. N00014-69-C-0435.
For the case where b(x, y) = 1 - e^{-y} for x ∈ R and y ≥ 0, Koopman [4, p. 617] made the following
observation. Suppose one allocates T_1 amount of effort in an optimal fashion, but fails to detect the
target. An increment T_2 of effort then becomes available. If one allocates this additional effort in an
incrementally optimal manner (i.e., optimal considering the previous allocation of T_1 amount of effort),
then one obtains an optimal allocation of T_1 + T_2 effort. That is, two incrementally optimal allocations
produce a totally optimal allocation.
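Koopman's observation can be illustrated for a discrete three-box search with b(x, y) = 1 - e^{-y} (the box probabilities and effort amounts below are invented for the illustration). For this concave effectiveness function the optimal allocation above a given lower bound has the water-filling form q_j = max(lower_j, ln(f_j/λ)), with the multiplier λ chosen to spend exactly the available effort:

```python
import math

def alloc(f, lower, extra, iters=200):
    """Maximize sum_j f[j]*(1 - exp(-q[j])) subject to q[j] >= lower[j]
    and sum_j (q[j] - lower[j]) = extra.  The optimality condition is
    q[j] = max(lower[j], log(f[j]/lam)); bisect on the multiplier lam."""
    lo, hi = 1e-12, max(f)
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        q = [max(lj, math.log(fj / lam)) for fj, lj in zip(f, lower)]
        spent = sum(qj - lj for qj, lj in zip(q, lower))
        if spent > extra:
            lo = lam            # overspending: raise the threshold lam
        else:
            hi = lam
    return q

f = [0.5, 0.3, 0.2]             # target location probabilities, three boxes
# Two incrementally optimal steps: first T1 = 1.0 of effort, then T2 = 0.8.
q1 = alloc(f, [0.0, 0.0, 0.0], 1.0)
q2 = alloc(f, q1, 0.8)
# One totally optimal allocation of T1 + T2 = 1.8 from scratch.
q_total = alloc(f, [0.0, 0.0, 0.0], 1.8)
assert all(abs(x - y) < 1e-6 for x, y in zip(q2, q_total))
```

The two incrementally optimal steps land on the same allocation as a single optimal allocation of 1.8 units of effort, exactly the behavior Koopman observed.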
In [2] an incomplete attempt was made to show that incrementally optimal allocations produce
totally optimal allocations provided that ∂b(x, y)/∂y is a positive monotonic nonincreasing function
of y for x ∈ R. In section 2 of this paper we show that for any Borel measurable local effectiveness
function, incrementally optimal allocations are totally optimal whenever the target's probability distribution is given by a density function as in (1.1). In the case where the search space is countable, we
prove that concavity of the local effectiveness function guarantees that incrementally optimal allocations are totally optimal. In addition, it is shown by counterexample that this property need not hold for
countable search spaces if the local effectiveness function is not concave.
A search plan is called uniformly optimal if it maximizes the probability of detection at each instant
during the search. In section 3, we show the existence of uniformly optimal search plans under additional hypotheses which are given there.
Our results hold in a more general situation than that of search theory. Thus, we introduce the
following framework, which is substantially the same as that in [6], one difference being that we deal
only with Borel functions. Let R be a Borel subset of Euclidean n-space. We fix Borel functions L and
U with L ≤ U which are defined on R. The functions L and U may take infinite values.
Define Ω = {(x, y) : x ∈ R, |y| < ∞ and L(x) ≤ y ≤ U(x)}. We fix a real-valued Borel function e
defined on Ω and the family H of a.e. (with respect to Lebesgue measure) real-valued Borel functions
q defined on R such that L ≤ q ≤ U. For q ∈ H we understand e(·, q(·)) to be a function from R to the
reals. Define

Φ = H ∩ {q : e(·, q(·)) and q are integrable},

and let

E(q) = ∫_R e(x, q(x)) dx and C(q) = ∫_R q(x) dx for q ∈ Φ.
All integration is Lebesgue integration. A q* ∈ Φ is said to be optimal if

E(q*) = max {E(q) : q ∈ Φ and C(q) = C(q*)}.
In the case where L(x) = 0, U(x) = ∞ for x ∈ R and e(x, y) = f(x) b(x, y) for (x, y) ∈ Ω, E(q) becomes the probability of detecting the target with allocation q and C(q) becomes the amount of effort
required by q. Then an optimal q* maximizes the probability of detection which can be obtained with
effort C(q*).
A function f defined on the real line is said to be increasing if y ≥ x implies f(y) ≥ f(x). A function
f is said to be concave if for all x, y in the domain of f, f(αx + (1 - α)y) ≥ αf(x) + (1 - α)f(y) for
0 ≤ α ≤ 1.
2. INCREMENTAL OPTIMIZATION
For i = 1, 2, . . ., let q_i ∈ Φ be such that q_1 ≤ q_2 ≤ . . .. Let q_0 = L. If

E(q_i) = max {E(q) : q ≥ q_{i-1}, q ∈ Φ and C(q) = C(q_i)} for i = 1, 2, . . .,

then we say that (q_1, q_2, . . .) is an incrementally optimal sequence. If q_i satisfies

E(q_i) = max {E(q) : q ∈ Φ and C(q) = C(q_i)} for i = 1, 2, . . .,

then (q_1, q_2, . . .) is said to be a totally optimal sequence. Define

ℓ(x, y, λ) = e(x, y) - λy for -∞ < λ < ∞ and (x, y) ∈ Ω.
The function ℓ is called a pointwise Lagrangian in [6] and λ is a Lagrange multiplier.
THEOREM 2.1: Let (q_1, q_2, . . .) be an incrementally optimal sequence such that for i = 1, 2, . . .,
E(q_i) < ∞ and C(q_i) is in the interior of the range of C. Then (q_1, q_2, . . .) is a totally optimal
sequence.
PROOF: By the definition of incremental optimality, q_1 is optimal. Thus, by Corollary 5.2 of [6],
there exists a real number λ_1 such that for a.e. x ∈ R

(2.1)   ℓ(x, q_1(x), λ_1) ≥ ℓ(x, y, λ_1) for |y| < ∞ and L(x) ≤ y ≤ U(x).
In other words, a necessary condition for q_1 to be optimal is that it maximize a pointwise Lagrangian
for some multiplier λ_1. Similarly, the incrementally optimal nature of q_2 implies the existence of a real
number λ_2 such that for a.e. x ∈ R

(2.2)   ℓ(x, q_2(x), λ_2) ≥ ℓ(x, y, λ_2) for |y| < ∞ and q_1(x) ≤ y ≤ U(x).
In order to prove that q_2 is optimal it is sufficient to find a real number λ such that for a.e. x ∈ R

(2.3)   ℓ(x, q_2(x), λ) ≥ ℓ(x, y, λ) for |y| < ∞ and L(x) ≤ y ≤ U(x).

The sufficiency of (2.3) follows from a well known result concerning Lagrange multipliers (see, for
example, [3], [8] or Theorem 2.1 of [6]).
By (2.1) and (2.2)
(2.4) k 2 (q 2 (x)q l (x)) =£e(*, q 2 {x))e{x, q^x)) =£ A, (q 2 {x) —q x (x)) for a.e. xeR.
Recall that q 2 2= q x . If q 2 {x) = q\{x) for a.e. xeR, then (2.3) holds for A= Ai. If q 2 {x) > q t (x) for xina
set of positive measure, then (2.4) implies that A 2 «S Ai. In this case for a.e. xeR and y such that \y \ < °°
andL(x) ^ y *£ q\{x) , we have
0*Se(x, g,(x))e(x, y)  A, {q>{x) y) ^ e(x, q l (x))e(x, y)  A 2 (<7,(*) y).
That is for a.e. jcc/?,
(2.5) /(*, y, A 2 ) ^ /(*, <?,(*), A 2 ) =£ t{x, q 2 (x), A 2 ) for y < oo, L(x) ^y^qi(x).
422 L. D. STONE
Combining (2.5) and (2.2) we obtain (2.3) with λ = λ₂. Thus, q₂ is optimal. By repeating the argument for q₃, q₄, . . ., the theorem is proved.
We now shift our attention to the case where R is a countable set. That is, for some countable subset J of the integers, R = {xⱼ : j∈J}. Let
E(q) = Σ_{j∈J} e(xⱼ, q(xⱼ)),
C(q) = Σ_{j∈J} q(xⱼ).
Carry over the definitions of incrementally and totally optimal sequences in the obvious way. One may use the method of proof given in Theorem 2.1 to show that incrementally optimal sequences are totally optimal for the case where R is countable, provided that the existence of a real number λ such that
(2.6) ℓ(xⱼ, q*(xⱼ), λ) = sup {ℓ(xⱼ, y, λ) : |y| < ∞ and L(xⱼ) ≤ y ≤ U(xⱼ)} for j∈J
is a necessary condition for q* to satisfy
(2.7) E(q*) = max {E(q) : q∈Φ and C(q) = C(q*)}.
From Corollary 5.3 and Remark 2.3 of [6] we conclude that if e(xⱼ, ·) is a concave function for j∈J, then (2.6) is necessary for (2.7). Thus, we may state the following theorem.
THEOREM 2.2: If R is countable and e(xⱼ, ·) is concave for j∈J, then any incrementally optimal sequence is totally optimal.
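Theorem 2.2 can be illustrated numerically on a small discrete problem: with concave cell payoffs, a greedy allocation built one unit of effort at a time (incrementally optimal by construction) attains the totally optimal value at every step. The particular payoff functions below are illustrative choices, not taken from the paper.

```python
# Two-cell sketch of Theorem 2.2 with concave, increasing cell payoffs.
def e(j, y):
    # concave payoffs: marginal gains shrink as effort y grows
    return [2.0 * (1 - 0.5 ** y), 1.5 * (1 - 0.25 ** y)][j]

def E(q):
    # total payoff of an integer allocation q = [q(0), q(1)]
    return sum(e(j, q[j]) for j in range(2))

q = [0, 0]
for step in range(1, 7):
    # greedy step: put the next unit where its marginal payoff is largest
    j = max(range(2), key=lambda c: e(c, q[c] + 1) - e(c, q[c]))
    q[j] += 1
    # brute-force totally optimal allocation of `step` units from scratch
    best = max(E([a, step - a]) for a in range(step + 1))
    assert abs(E(q) - best) < 1e-12   # concavity: greedy attains the optimum
print(q, E(q))
```

The assertion inside the loop is exactly the incremental-equals-total claim; replacing either payoff with a non-concave function breaks it, as the next example shows.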
The following example shows that one cannot remove the assumption that e(xⱼ, ·) is concave in Theorem 2.2. The example also shows that (2.6) is not necessary for (2.7) when e(xⱼ, ·) is not concave for j∈J.
EXAMPLE 2.3: Let R = {1, 2} be a doubleton set, L = 0, U(1) = 2, and U(2) = √3. Define
e(1, y) = y for 0 ≤ y ≤ 2,
e(2, y) = y for 0 ≤ y ≤ 1,  e(2, y) = (y² + 1)/2 for 1 < y ≤ √3.
Note that both e(1, ·) and e(2, ·) are everywhere differentiable. For 0 ≤ T ≤ 2 + √3, define
η(1, T) = 0 for 0 ≤ T < √3,  η(1, T) = T − √3 for √3 ≤ T ≤ 2 + √3,
η(2, T) = T for 0 ≤ T < √3,  η(2, T) = √3 for √3 ≤ T ≤ 2 + √3.
Then η(i, ·), i = 1, 2, is increasing, and for each T ≥ 0, C(η(·, T)) = T and E(η(·, T)) gives the maximum of E(q) over all nonnegative functions q defined on {1, 2} such that C(q) = T. Note that
for q* = η(·, 1), (2.6) is not satisfied for any λ. An example of a function q* which satisfies (2.7), but for which there is no λ satisfying (2.6), is also given in [8].
One may check that the sequence of allocations (q₁, q₂), where q₁(1) = 1, q₁(2) = 0 and q₂(1) = 1, q₂(2) = 1, is incrementally optimal. However, E(q₂) = 2 and C(q₂) = 2, while
E(η(·, 2)) = 4 − √3 > 2,
so that q₂ is not optimal, i.e., (q₁, q₂) is not totally optimal.
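The failure mode of Example 2.3 can be checked numerically. The payoff e(1, ·) and the values of e(2, ·) on [0, 1] follow the example; the convex continuation (y² + 1)/2 of e(2, ·) on (1, √3] is an assumed form, chosen only to be differentiable at y = 1, to exceed y there, and to reproduce the example's qualitative claims.

```python
import math

# Example 2.3 phenomenon: after the incremental step from q1 = (1, 0), no
# allocation with q >= q1 can reach the unconstrained optimum for effort 2.
SQ3 = math.sqrt(3)

def e1(y):                                  # cell 1 payoff, 0 <= y <= 2
    return y

def e2(y):                                  # cell 2 payoff, 0 <= y <= sqrt(3)
    return y if y <= 1 else (y * y + 1) / 2  # assumed convex continuation

def E(y1, y2):
    return e1(y1) + e2(y2)

N = 20000
# Incremental step from q1 = (1, 0): allocations (1 + a, 1 - a), 0 <= a <= 1.
inc_best = max(E(1 + a / N, 1 - a / N) for a in range(N + 1))
# Totally optimal benchmark: all allocations (2 - t, t), 0 <= t <= sqrt(3).
tot_best = max(E(2 - i * SQ3 / N, i * SQ3 / N) for i in range(N + 1))
print(round(inc_best, 6), round(tot_best, 6))
assert inc_best < tot_best                  # the incremental path falls short
```

Here `inc_best` is 2 while `tot_best` approaches 4 − √3 ≈ 2.268, matching the example: incremental optimality does not imply total optimality without concavity.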
In [2, p. 328] it is claimed that (in our notation) the existence of a function η defined on R × [0, S] such that η(·, T) is an optimal allocation of T amount of effort for each 0 ≤ T ≤ S and η(x, ·) is increasing for x∈R guarantees that incrementally optimal sequences are totally optimal. Example 2.3 shows that for discrete R this claim does not hold. If in addition to the existence of a function η satisfying the above conditions we have that for each amount of effort there is an almost everywhere unique optimal allocation of that effort, then any incrementally optimal sequence is totally optimal. Although not stated as such, this result is proven in [2].
Example 2.3 shows, of course, that optimal allocations need not be unique. Even when E and C are defined as integrals with respect to Lebesgue measure on n-space as is done in section 1, optimal allocations need not be unique. In fact, it is easy to see that if there exists a subset D of R having positive measure such that for x∈D the graph of e(x, ·) contains a nondegenerate straight-line segment of slope λ, then there are amounts of effort for which an optimal allocation of that effort is not almost everywhere unique.
REMARK 2.4: Let us return to the search situation described in section 1. That is, L(x) = 0, U(x) = ∞ for x∈R, e(x, y) = f(x)b(x, y) for (x, y)∈Ω. Suppose that an optimal allocation q₁ has been performed and that the search has failed to detect the target. Let f₁ be the posterior target location density given failure to detect the target. Thus
(2.7) f₁(x) = f(x)[1 − b(x, q₁(x))] / [1 − E(q₁)].
For x∈R, let b₁(x, ·) be the conditional local effectiveness function at x given that q₁(x) search effort density was placed at x and the target not detected. Then
(2.8) b₁(x, y) = [b(x, q₁(x) + y) − b(x, q₁(x))] / [1 − b(x, q₁(x))].
Suppose that h is an allocation of effort which is added onto the original allocation q₁, so that the resulting total effort density is q₁(x) + h(x) for x∈R. Then
(2.9) E₁(h) = ∫_R f₁(x) b₁(x, h(x)) dx
is the conditional probability of detecting the target given that allocation q₁ failed.
Fix an increment of effort T. Suppose h* has the property that ∫_R h*(x) dx = T and
E₁(h*) = max {E₁(h) : h ≥ 0 and ∫_R h(x) dx = T}.
Then h* is sometimes called a conditionally optimal search. If we let q₂ = q₁ + h*, then we claim (q₁, q₂) is an incrementally optimal sequence. To see this, we observe that by (2.7) and (2.8),
E₁(h) = [E(q₁ + h) − E(q₁)] / [1 − E(q₁)].
Thus maximizing E₁ subject to h ≥ 0 and ∫_R h(x) dx = T is equivalent to maximizing E subject to q ≥ q₁ and C(q) = C(q₁) + T. The claim now follows from the definition of incremental optimality, and we see that the concepts of incremental and conditional optimality coincide for searches of the type discussed in this paper. Hence, under the conditions of Theorem 2.1 or 2.2, a sequence of conditionally optimal searches (h₁, h₂, . . .) produces, by setting qᵢ = Σ_{k=1}^{i} hₖ, a totally optimal sequence (q₁, q₂, . . .) of search allocations.
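The identity E₁(h) = [E(q₁ + h) − E(q₁)]/[1 − E(q₁)] behind Remark 2.4 can be verified on a discretized target space. The density f(x) = 2x on [0, 1) and the exponential detection function b(x, y) = 1 − exp(−y) below are illustrative choices, not taken from the paper.

```python
import math, random

# Grid check that the conditional detection probability (2.9), built from the
# posterior density (2.7) and conditional effectiveness (2.8), equals the
# normalized increment of the unconditional detection probability E.
random.seed(1)
dx = 0.01
xs = [i * dx for i in range(100)]           # grid on R = [0, 1)

f = {x: 2 * x for x in xs}                  # target density, integral ~ 1
b = lambda x, y: 1 - math.exp(-y)           # illustrative detection function
q1 = {x: x for x in xs}                     # an arbitrary first allocation
h = {x: random.random() for x in xs}        # an arbitrary added increment

E = lambda q: sum(f[x] * b(x, q[x]) * dx for x in xs)

f1 = {x: f[x] * (1 - b(x, q1[x])) / (1 - E(q1)) for x in xs}            # (2.7)
b1 = lambda x, y: (b(x, q1[x] + y) - b(x, q1[x])) / (1 - b(x, q1[x]))   # (2.8)

E1 = sum(f1[x] * b1(x, h[x]) * dx for x in xs)                          # (2.9)
direct = (E({x: q1[x] + h[x] for x in xs}) - E(q1)) / (1 - E(q1))
assert abs(E1 - direct) < 1e-9
print(round(E1, 6))
```

Because the identity is pointwise algebra, it holds exactly on the grid; this is why maximizing E₁ over increments h is the same problem as maximizing E over q ≥ q₁.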
3. EXISTENCE THEOREM
In this section we find conditions under which uniformly optimal search plans or allocation schedules exist. More precisely, let J be an interval of real numbers. Then an allocation schedule over J is a Borel function η defined on R × J such that for T∈J, η(·, T)∈Φ and for a.e. x∈R, η(x, ·) is increasing. An allocation schedule η is uniformly optimal if
(3.1) C(η(·, T)) = T and E(η(·, T)) = max {E(q) : C(q) ≤ T} for T∈J.
This definition is a generalization of the definition of uniform optimality for search plans given by Arkin in [1]. In the special case where E(q) gives the probability of detection resulting from the allocation of search effort q, we call η a search plan. Then a uniformly optimal search plan maximizes the probability of detection at each instant during the search. In order to prove the existence of such allocation plans we define a notion of coverability similar to the one in [7].
Suppose p is a real valued function defined on an interval J of real numbers. If p is concave, then throughout the interior of its domain, p′ exists a.e. and is decreasing. Moreover, if p is continuous, then p(t) − p(s) = ∫ₛᵗ p′(r) dr for s, t∈J. By an extreme point of a concave function p, we mean a point on its graph which does not lie on a chord joining two other points on the graph.
Define m(x, ·) to be the minimal concave majorant of e(x, ·) for all x∈R for which such a majorant exists. We say that m covers e if the following conditions are satisfied.
(i) For a.e. x∈R, m(x, ·) exists and is continuous.
(ii) m is a Borel function.
(iii) e(x, y) = m(x, y) whenever (y, m(x, y)) is an extreme point of m(x, ·).
Note that condition (iii) is equivalent to assuming that e(x, ·) is upper semicontinuous at y such that (y, m(x, y)) is an extreme point of m(x, ·). For q∈Φ we define
M(q) = ∫_R m(x, q(x)) dx
whenever the integral on the right exists.
Differentiation is always with respect to the last component of the argument, and is denoted by a prime, e.g., for (x, y)∈Ω, e′(x, y) = lim_{h→0} [e(x, y + h) − e(x, y)]/h. Let m cover e. If a function q∈Φ and a real number λ satisfy, for a.e. x∈R,
m′(x, y) ≥ λ for a.e. y such that L(x) < y < q(x),
m′(x, y) ≤ λ for a.e. y such that q(x) < y < U(x),
then we say that the pair (q, λ) satisfies the Neyman–Pearson inequalities. When e(x, ·) and m(x, ·) are increasing and U(x) = ∞, it is convenient to define
e(x, ∞) = lim_{y→∞} e(x, y) and m(x, ∞) = lim_{y→∞} m(x, y).
Before proceeding with our main existence result, we prove two lemmas which will be useful in this section. Lemma 3.1 relates closely to Theorem 1 and Remark 3 of [5].
LEMMA 3.1: Let m cover e. If there is a q*∈Φ such that E(q*) > −∞ and a λ ≥ 0 such that for a.e. x∈R
(3.2)
(i) m′(x, y) ≥ λ for a.e. y such that L(x) < y < q*(x),
(ii) m′(x, y) ≤ λ for a.e. y such that q*(x) < y < U(x),
(iii) e(x, q*(x)) = m(x, q*(x)),
then
(3.3) E(q*) = max {E(q) : C(q) ≤ C(q*)}.
PROOF: By (3.2) (iii), M(q*) exists. It is an easily shown Neyman–Pearson result (see Theorem 1 of [5]) that for λ ≥ 0, (i) and (ii) imply
(3.4) M(q*) = max {M(q) : C(q) ≤ C(q*)}.
Suppose that there is an r∈Φ such that E(r) > E(q*) and C(r) ≤ C(q*). Since m majorizes e, we have
M(r) ≥ E(r) > E(q*) = M(q*),
which contradicts (3.4). This proves the lemma.
For λ > 0 and x such that m(x, ·) exists, we define
φᵤ(x, λ) = sup {y : y = L(x) or m′(x, y) ≥ λ},
φ_ℓ(x, λ) = inf {y : y = U(x) or m′(x, y) ≤ λ}.
Then for λ > 0, we let
I_ℓ(λ) = ∫_R φ_ℓ(x, λ) dx,  Iᵤ(λ) = ∫_R φᵤ(x, λ) dx.
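These definitions can be made concrete. Assume, purely for illustration, m(x, y) = f(x)(1 − e^{−y}) with f(x) = 2x on R = [0, 1], L = 0 and U = ∞ (not the paper's example). Then m′(x, y) = f(x)e^{−y} is strictly decreasing in y, so φᵤ(x, λ) = max(0, log(f(x)/λ)), and Iᵤ can be computed directly:

```python
import math

# Sketch of phi_u and I_u for m(x, y) = 2x * (1 - exp(-y)) on R = [0, 1].
dx = 1e-3
xs = [(i + 0.5) * dx for i in range(1000)]      # midpoint grid on [0, 1]

def phi_u(x, lam):
    # m'(x, y) = 2x * exp(-y) is strictly decreasing in y, so the sup
    # defining phi_u is max(0, log(2x / lam)).
    return max(0.0, math.log(2 * x / lam))

def I_u(lam):
    return sum(phi_u(x, lam) * dx for x in xs)

lams = [2.0, 1.0, 0.5, 0.25]
vals = [I_u(lam) for lam in lams]
assert all(a < b for a, b in zip(vals, vals[1:]))   # I_u decreases in lambda
print([round(v, 4) for v in vals])
```

The run exhibits property (b) of the next lemma, Iᵤ decreasing in λ; here m′ is strictly decreasing, so φ_ℓ = φᵤ and I_ℓ = Iᵤ, and no jump intervals arise.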
The functions φ_ℓ and φᵤ will be our main tools for constructing solutions to the constrained extremal problems considered here. The following lemma displays some of the properties of these functions.
LEMMA 3.2: Suppose m covers e and for a.e. x∈R, e(x, ·) is increasing. If −∞ < E(L) ≤ E(U) < ∞ and C(L) is finite, then the following hold:
(a) φᵤ(·, λ)∈Φ and φ_ℓ(·, λ)∈Φ for λ > 0.
(b) I_ℓ and Iᵤ are finite and decreasing.
(c) φ_ℓ(x, ·) and I_ℓ are right continuous, and φᵤ(x, ·) and Iᵤ are left continuous.
(d) (φ_ℓ(·, λ), λ) and (φᵤ(·, λ), λ) satisfy the Neyman–Pearson conditions.
(e) A pair (q, λ), where q∈Φ and λ ≥ 0, satisfies the Neyman–Pearson inequalities if, and only if, φ_ℓ(x, λ) ≤ q(x) ≤ φᵤ(x, λ) for a.e. x∈R.
(f) For any λ > 0, we may find a Borel function a defined on R × [I_ℓ(λ), Iᵤ(λ)], such that
(1) a(x, ·) is increasing for a.e. x∈R,
(2) C(a(·, T)) = T for I_ℓ(λ) ≤ T ≤ Iᵤ(λ),
(3) (a(·, T), λ) satisfies the Neyman–Pearson inequalities for all I_ℓ(λ) ≤ T ≤ Iᵤ(λ).
(g) lim_{λ→∞} Iᵤ(λ) = C(L).
(h) For λ > 0 and x such that m(x, ·) exists, (φᵤ(x, λ), m(x, φᵤ(x, λ))) and (φ_ℓ(x, λ), m(x, φ_ℓ(x, λ))) are extreme points of m(x, ·).
PROOF: A straightforward verification shows that φ_ℓ(·, λ) and φᵤ(·, λ) are Borel functions for each λ > 0 and that (a) holds. Thus, the integrals I_ℓ(λ) and Iᵤ(λ) are well defined for each λ > 0.
For a.e. x∈R, the following hold. Since e(x, ·) is increasing, m(x, ·) is increasing. If U(x) is finite, then (U(x), m(x, U(x))) is an extreme point and m(x, U(x)) = e(x, U(x)). If U(x) = ∞, then the increasing nature of e(x, ·) and the minimal nature of m(x, ·) yields m(x, ∞) = e(x, ∞). Since C(L) is finite, L(x) is finite and m(x, L(x)) = e(x, L(x)).
To prove (b), we observe that
−∞ < E(L) = M(L) ≤ E(U) = M(U) < ∞.
Thus, for a.e. x∈R, m(x, L(x)) and m(x, U(x)) are finite. Since m(x, ·) is increasing, we have for a.e. x∈R,
m(x, U(x)) − m(x, L(x)) ≥ ∫_{L(x)}^{z} m′(x, y) dy ≥ (z − L(x)) m′(x, z) for L(x) < z < U(x).
Thus, m′(x, z) ≤ [m(x, U(x)) − m(x, L(x))]/(z − L(x)), and it follows that
φᵤ(x, λ) ≤ (1/λ)[m(x, U(x)) − m(x, L(x))] + L(x) for λ > 0.
Hence,
−∞ < C(L) ≤ I_ℓ(λ) ≤ Iᵤ(λ) ≤ (1/λ)[M(U) − M(L)] + C(L) < ∞ for λ > 0,
which proves that I_ℓ and Iᵤ are finite. The decreasing nature of m′(x, ·) for a.e. x∈R guarantees that φᵤ(x, ·) and φ_ℓ(x, ·) are decreasing for a.e. x∈R. Thus, (b) follows.
The left continuity of φᵤ(x, ·) and the right continuity of φ_ℓ(x, ·) for a.e. x∈R follow from their definitions and the decreasing nature of m′(x, ·). The monotone convergence theorem and the finiteness of Iᵤ and I_ℓ may be used to show the left and right continuity of Iᵤ and I_ℓ, respectively. Thus, (c) holds.
Properties (d) and (e) follow directly from the definitions of φ_ℓ and φᵤ. In order to prove (f), we use a device of Arkin's [1] and define for 0 ≤ s ≤ ∞
h_λ(x, s) = φᵤ(x, λ) if |x| < s,  h_λ(x, s) = φ_ℓ(x, λ) if |x| ≥ s,
and
H_λ(s) = ∫_R h_λ(x, s) dx.
By the monotone convergence theorem, H_λ is continuous. Moreover, H_λ is increasing and
H_λ(0) = I_ℓ(λ),  H_λ(∞) = Iᵤ(λ).
Thus for I_ℓ(λ) ≤ T ≤ Iᵤ(λ), we may choose ξ(T) such that H_λ(ξ(T)) = T. Defining a(x, T) = h_λ(x, ξ(T)) for x∈R, we see that a satisfies conditions (1) and (2) of (f). Condition (3) follows from (e). Property (g) follows easily from the monotone convergence theorem and the definition of φᵤ. Property (h) may be verified from the definitions of φ_ℓ, φᵤ, and an extreme point. This completes the proof.
THEOREM 3.3: Suppose m covers e, and for a.e. x∈R, e(x, ·) is increasing. If −∞ < E(L) < E(U) < ∞ and C(L) is finite, then there exists a uniformly optimal allocation schedule η over [C(L), C(U)).
PROOF: We consider first the case where Iᵤ(0) ≡ lim_{λ→0} Iᵤ(λ) = C(U). The case Iᵤ(0) < C(U) requires only routine modifications which are discussed at the end of the proof. We take η(·, C(L)) = L.
Since Iᵤ is monotone, it has only a countable number of discontinuities. Let K be a countable index set such that {λₖ : k∈K} is the set of discontinuity points of Iᵤ. Let Jₖ = (I_ℓ(λₖ), Iᵤ(λₖ)] for k∈K. The intervals Jₖ are disjoint and are the jump intervals at the discontinuity points of Iᵤ. For T∈(C(L), C(U)) − ∪_{k∈K} Jₖ, let
λ*(T) = sup {λ : Iᵤ(λ) ≥ T}.
By the left continuity of Iᵤ, Iᵤ(λ*(T)) = T.
For T∈Jₖ, let λ*(T) = λₖ and choose a function aₖ defined on R × Jₖ to have the properties of a in (f) of Lemma 3.2. Define
η(x, T) = φᵤ(x, λ*(T)) if C(L) < T < C(U) and T ∉ ∪_{k∈K} Jₖ,
η(x, T) = aₖ(x, T) if T∈Jₖ and k∈K.
Then for each C(L) < T < C(U), C(η(·, T)) = T and (η(·, T), λ*(T)) satisfies the Neyman–Pearson conditions. Since m covers e and property (h) of Lemma 3.2 holds, we have that for each C(L) < T < C(U), e(x, η(x, T)) = m(x, η(x, T)) for a.e. x∈R. Thus, by Lemma 3.1, η satisfies (3.1).
To verify that η(x, ·) is increasing for a.e. x∈R, we let R′ be the set of x∈R such that m(x, ·) exists. Then by the fact that m covers e, R − R′ has measure 0. Suppose it is not the case that η(x, ·) is increasing for a.e. x∈R. Then there is an x∈R′ and numbers S and T such that C(L) < T < S < C(U) and
(3.4) η(x, S) < η(x, T).
Since (η(·, T), λ*(T)) and (η(·, S), λ*(S)) satisfy the Neyman–Pearson inequalities for all x∈R′, we have
λ*(T) ≤ m′(x, y) ≤ λ*(S) for a.e. y such that η(x, S) < y < η(x, T).
One may check that λ* is a decreasing function, so that λ*(T) = λ*(S). Thus, for some k∈K, T and S are both in the closure of Jₖ. However, η(x, ·) is constructed by property (f) of Lemma 3.2 to be increasing on the closure of Jₖ. This contradicts (3.4) and proves the theorem for the case where Iᵤ(0) = C(U).
If Iᵤ(0) < C(U), we proceed as before for C(L) < T < Iᵤ(0). We then define
φᵤ(x, 0) = lim_{λ→0} φᵤ(x, λ).
From the increasing nature of e(x, ·), it follows that if q∈Φ and q(x) ≥ φᵤ(x, 0) for x∈R, then q will satisfy (3.2) with λ = 0. Hence, to complete the definition of η(x, ·) for Iᵤ(0) ≤ T < C(U), one need only choose η so that η(x, T) ≥ φᵤ(x, 0) and C(η(·, T)) = T, which may easily be done. This completes the proof.
Observe that the hypotheses of Theorem 3.3 may be weakened to require that m(x, ·) rather than e(x, ·) be increasing for a.e. x∈R. The theorem remains unchanged except that η must be restricted to [C(L), Iᵤ(0)]. This is no real restriction since for q ≥ φᵤ(·, 0), E(q) ≤ E(φᵤ(·, 0)).
Theorem 2 of Arkin [1] is similar to Theorem 3.3 above with the exception that [1] claims that there exists a function β such that
η(x, T) = ∫₀ᵀ β(x, s) ds,  C(η(·, T)) = T,
and η is uniformly optimal. However, the following is a counterexample to Theorem 2 of [1]. (Moreover, the proof in [1] is not sufficient to show the truth of Theorem 3.3.)
Let R = [0, 1], L = 0, U = ∞, and
e(x, y) = 0 for 0 ≤ y < 1,  e(x, y) = 1 for y ≥ 1,
for x∈R. It is clear that any uniformly optimal search plan η must have the property that for a.e. x∈R, η(x, ·) jumps from 0 to 1 at some point T, but there is no function β which produces this behavior for η.
Under the conditions of Theorem 3.3, we have shown that there exists, for any C(L) ≤ T < C(U), a q* such that C(q*) = T and E(q*) = max {E(q) : q∈Φ and C(q) ≤ T}. Theorem 8 of [7] provides a similar existence result whenever m covers e and −∞ < C(L) ≤ C(U) < ∞. In comparison, Theorem 3.3 of this paper removes the restriction that C(U) < ∞, but adds monotonicity conditions on e(x, ·) and boundedness conditions on E. In [6] there is also a discussion of related existence theorems.
One might conjecture that Theorem 3.3 would remain true without assuming that e(x, ·) is increasing, provided that we assumed |E(q)| < B for some number B and all q∈Φ. Similarly, one might conjecture that the restriction C(L) > −∞ could be omitted. However, the following two counterexamples show both of these conjectures to be false.
EXAMPLE 3.4: Let R = [1, ∞), L = 0, and U(x) = x + 1/x² for x∈R. For x∈R, define
e(x, y) = y for 0 ≤ y ≤ 1/x²,
e(x, y) = 1/x² − (y − 1/x²)/x³ for 1/x² < y ≤ x + 1/x².
Note that m = e and that |E(q)| ≤ 1 for all q∈Φ. Suppose q* is optimal and
∞ > C(q*) > 1.
By Corollary 7.2 of [6] there exists a λ such that
(3.5) e(x, q*(x)) − λq*(x) = sup {e(x, y) − λy : 0 ≤ y ≤ x + 1/x²} for a.e. x∈R.
Since e(x, ·) is concave for x∈R, this implies
e′(x, y) ≥ λ for 0 < y < q*(x),
e′(x, y) ≤ λ for q*(x) < y < x + 1/x², for a.e. x∈R.
One may check that if λ ≥ 0, then C(q*) ≤ 1. Thus, the above λ must be negative. It follows from the above inequalities that q*(x) = x + 1/x² for x³ > −1/λ. Thus, C(q*) = ∞, which contradicts our assumption that C(q*) < ∞. Thus, one cannot replace the monotonicity of e(x, ·) by boundedness of E in Theorem 3.3.
EXAMPLE 3.5: Let R = [1, ∞), L = −1, U = 1, and
e(x, y) = y/x² for −1 ≤ y ≤ 1.
Observe that e = m and all the conditions of Theorem 3.3 are satisfied except that C(L) = −∞. Suppose q* is optimal and C(q*) is finite. Again by Corollary 5.2 of [6] there exists a λ such that
e(x, q*(x)) − λq*(x) = sup {e(x, y) − λy : −1 ≤ y ≤ 1} for a.e. x∈R.
Hence,
e′(x, y) ≥ λ for −1 < y < q*(x),
e′(x, y) ≤ λ for q*(x) < y < 1, for a.e. x∈R.
It follows that
q*(x) = 1 for x² < 1/λ,  q*(x) = −1 for x² > 1/λ.
Hence, either C(q*) = −∞ or C(q*) = +∞, contrary to the assumption that C(q*) is finite. Thus, we cannot omit the condition C(L) > −∞ in Theorem 3.3.
REMARK 3.6: In the search theory case where L(x) = 0, U(x) = ∞ for x∈R and e(x, y) = f(x)b(x, y) for (x, y)∈Ω, the conditions of Theorem 3.3 will be satisfied if b(x, ·) is right continuous for x∈R. Since b(x, ·) is increasing and |E(q)| ≤ 1 for q∈Φ, the only condition that is not obviously satisfied is the coverability condition. However, since b(x, ·) is increasing and right continuous, it is upper semicontinuous. Thus, e(x, ·) has a minimal concave majorant m(x, ·) which is continuous, and one may check that e(x, y) = m(x, y) whenever (y, m(x, y)) is an extreme point of m(x, ·). It can be shown that since e is Borel, m is a.e. equal to a Borel function. Thus the conditions for coverability are satisfied. It follows that whenever the local effectiveness function is right continuous and the target location distribution is given by a density function on Euclidean n-space, a uniformly optimal search plan exists. Note that uniformly optimal search plans may be used to produce sequences which are both incrementally and totally optimal.
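For the setting of Remark 3.6, a uniformly optimal search plan can be sketched concretely. Assume an exponential detection function b(x, y) = 1 − exp(−y) and a small discrete target distribution p (illustrative choices, not the paper's): the Neyman–Pearson allocation for multiplier λ is qᵢ = max(0, log(pᵢ/λ)), and λ is tuned so that total effort equals T.

```python
import math, random

# Water-filling construction of a search plan that is optimal at every
# effort level T, with the allocations nested (increasing in T).
random.seed(7)
p = [0.5, 0.3, 0.15, 0.05]          # illustrative target distribution

def alloc(T):
    lo, hi = 1e-12, max(p)          # bracket the multiplier lambda
    for _ in range(200):            # bisect: total effort decreases in lambda
        lam = (lo + hi) / 2
        q = [max(0.0, math.log(pi / lam)) for pi in p]
        if sum(q) > T:
            lo = lam
        else:
            hi = lam
    return q

def P(q):                           # detection probability E(q)
    return sum(pi * (1 - math.exp(-qi)) for pi, qi in zip(p, q))

prev = [0.0] * len(p)
for T in [0.5, 1.0, 2.0, 4.0]:
    q = alloc(T)
    assert all(a <= b + 1e-9 for a, b in zip(prev, q))   # plan is increasing
    for _ in range(200):            # compare with random feasible allocations
        w = [random.random() for _ in p]
        s = sum(w)
        r = [T * wi / s for wi in w]
        assert P(r) <= P(q) + 1e-9
    prev = q
print([round(v, 3) for v in alloc(2.0)])
```

Because the allocations are nested, successive increments of this plan form a sequence that is both incrementally and totally optimal, as the remark notes.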
REFERENCES
[1] Arkin, V. I., "Uniformly Optimal Strategies in Search Problems," Theory of Probability and Its Applications 9, 674-680 (1964).
[2] Dobbie, James M., "Search Theory: A Sequential Approach," Nav. Res. Log. Quart. 10, 323-334 (Dec. 1963).
[3] Everett, Hugh, "Generalized Lagrange Multiplier Method for Solving Problems of Optimum Allocation of Resources," Operations Res. 11, 399-417 (1963).
[4] Koopman, B. O., "The Theory of Search: III. The Optimum Distribution of Searching Effort," Operations Res. 5, 613-626 (1957).
[5] Wagner, D. H., "Nonlinear Functional Versions of the Neyman-Pearson Lemma," SIAM Rev. 11, 52-65 (June 1969).
[6] Wagner, D. H. and L. D. Stone, "Necessity and Existence Results on Constrained Optimization of Separable Functionals by a Multiplier Rule," to appear in SIAM J. Control 12 (1974).
[7] Wagner, D. H. and L. D. Stone, "Optimization of Allocations Under a Coverability Condition," to appear in SIAM J. Control 12 (1974).
[8] Zahl, S., "An Allocation Problem With Applications to Operations Research and Statistics," Operations Res. 11, 426-441 (1963).
AN APPROACH TO THE ALLOCATION
OF COMMON COSTS OF MULTIMISSION SYSTEMS*
Robert Thomas Crow
School of Management
State University of New York at Buffalo
ABSTRACT
Many Naval systems, as well as other military and civilian systems, perform multiple
missions. An outstanding problem in cost analysis is how to allocate the costs of such mis-
sions so that their true costs can be determined and resource allocation optimized. This
paper presents a simple approach to handling this problem for single systems. The approach
is based on the theory of peak-load pricing as developed by Marcel Boiteux. The basic
principle is that the long-run marginal cost of a mission must be equal to its "price." The
implication of this is that if missions can cover their own marginal costs, they should also
be allocated some of the marginal common costs. The proportion of costs to be allocated is
shown to be a function not only of the mission-specific marginal costs and the common marginal
costs, but also of the "mission price." Thus, it is shown that measures of effectiveness must
be developed for rational cost allocation. The measurement of effectiveness has long been
an intractable problem, however. Therefore, several possible means of getting around this
problem are presented in the development of the concept of relative mission prices.
THE PROBLEM
This paper is an attempt to provide a new method of allocating the common costs of new investments in a multimission system to individual missions in a way that is (1) operational, (2) objective, and (3) defensible from the point of view of efficient resource allocation.† The most important reason for allocating common costs to individual missions is to provide guidance for procurement of systems and to estimate the costs of accomplishing given missions by alternative systems. If common costs are properly allocated, it is possible to estimate the true costs of accomplishing a mission with one system compared to another.
For an illustration of the problem, consider Table 1. Five systems are shown that can be combined to accomplish three missions. Systems A, B, and C are single-mission systems and therefore, by definition, have no common costs. Systems D and E are multimission systems characterized by significant common costs, as well as incremental costs which are specific to each mission. How much does it cost to accomplish the missions by the use of multimission systems? Which systems should be procured: single-mission, multimission, or some combination? Obviously, military systems problems are too complex to be accurately characterized by such simple questions. Yet it appears that in day-to-
*The work on which this article is based was performed for the Chief of Naval Operations, Systems Analysis, as part of Contract No. N00014-70-C-0086 with Mathematica, Inc. This paper is a revision of portions of a report [4] prepared for that contract.
†Common costs are defined as those which are incurred by a single system regardless of which of a number of missions is being performed. They may arise from either operation or investment. It should be clear that, since the subject is investment decisions, it is the incremental common costs that are to be allocated.
day operations, they are often posed in this fashion, for first-approximation purposes if not for final decisions.
TABLE 1. Comparison of Costs of Single- and Multi-Mission Systems

                         Systems
    Costs        A     B     C     D     E
    Common       -     -     -    50    60
    Mission 1   65     -     -    20     -
    Mission 2    -    45     -    30    30
    Mission 3    -     -    60     -    10
    Totals      65    45    60   100   100
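The column totals of Table 1 can be checked mechanically, assuming the split into a common-cost row and per-mission incremental rows (the row labels are inferred from the surrounding discussion rather than printed in the source):

```python
# Consistency check of Table 1: each system's total is the sum of its
# common cost (if any) and its mission-specific incremental costs.
costs = {
    "A": {"mission 1": 65},
    "B": {"mission 2": 45},
    "C": {"mission 3": 60},
    "D": {"common": 50, "mission 1": 20, "mission 2": 30},
    "E": {"common": 60, "mission 2": 30, "mission 3": 10},
}
totals = {system: sum(parts.values()) for system, parts in costs.items()}
assert totals == {"A": 65, "B": 45, "C": 60, "D": 100, "E": 100}
print(totals)
```

The check makes the text's point concrete: D and E each cost 100 in total, but how much of D's 50 (or E's 60) of common cost "belongs" to each mission is exactly the allocation problem the paper addresses.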
As an example of the consequences of misallocating common costs: if, in a given system, all common costs are allocated to one particularly important mission, the system's capability in that mission may appear costly relative to other systems, and the system may not be purchased. Even if it is purchased, it may be underutilized. On the other hand, other mission capabilities of the system may appear less costly than they in fact are, leading to overpurchase or overutilization.
In general, the allocation of common costs has been avoided by economists. Most work on common costs has focused on short-run marginal cost analysis and hence only on the costs that are variable for each specific output (or mission, in our context).* This focus on short-run problems sidesteps the issue of how to handle common costs in investment decisions, where there are normally several alternative courses of action and no costs are fixed. For pricing and other types of decisions, in both the private and public sectors, if other than short-run marginal or incremental prices are considered, reliance has usually been placed on arbitrary allocations of common costs to some or all outputs. This was the general rule until 1949, when Marcel Boiteux wrote an ingenious and basic article on peak-load pricing for electrical utilities, translated into English in 1964 (Ref. [2]). In the approach presented here, two principles will be emphasized: (1) (following Boiteux) efficient allocation of common costs can and must be based on "marginal" conditions; and (2) efficient allocation must be based explicitly on considerations of how well a given system performs a given mission, as well as an assessment of the need for capability in that mission.
SOME BASIC PRINCIPLES
A basic theorem of resource allocation is that in a competitive economy the maximization of output in equilibrium occurs when price (P) equals the cost of producing one additional unit of output (marginal cost or MC). The P = MC output can be proven to be optimal if competitive conditions hold throughout the economy (assuming the existence of U-shaped average cost curves). This is a fundamental principle in the ensuing discussion on cost allocation.
*For an example, see Ref. [3, ch. 5]. Professor William Baumol has pointed out in correspondence that there have been a number of recent exceptions to this general rule for civilian systems, such as utilities. In particular, see Ref. [6].
COST OF MULTIMISSION SYSTEMS 433
Furthermore, it is necessary to recognize that before any costs are sunk, they are all variable. Thus, in planning for investment, the criterion that price must equal marginal cost means that the marginal cost measure must include investment cost. Assuming for the moment that a set of "prices" can be established for the value of accomplishing a military mission, the widely used criterion of price equal to short-run marginal cost (marginal operating cost) is valid only as the minimum price at which a particular system should be used in a given mission. In other words, for a given system, if the mission is not needed sufficiently to be worth sacrificing enough resources to pay for its marginal operating costs, then the mission should not be performed.
An example of the possible consequences of using only short-run marginal costs in system decisions is shown in Figures 1 and 2. The amount of output of each of the two systems is q, and the vertical distances Of₁ and Of₂ represent investment costs, while the vertical extents of the shaded areas represent marginal costs. In this case, it is clear that short-run marginal costs are lower for system 2 than for system 1; however, in the long run, the total cost of system 1 is less because its investment costs are lower. Therefore, it should be chosen in spite of its higher short-run marginal costs.

Figure 1. Output with high marginal operating costs and low marginal investment costs, System 1.

Figure 2. Output with low marginal operating costs and high marginal investment costs, System 2.
In order to include investment costs in the marginal cost measures, it is necessary to distinguish long-run from short-run costs. Long-run costs represent situations where output is variable through investment, rather than by changing utilization of existing capacity (which can occur over the short run). The question is how to reconcile the short-run condition of price equal to marginal cost with the necessary condition of long-run marginal costs being covered. The solution is deceptively simple: for a given objective to be achieved, purchase that number of units of the system at which the price of each unit equals both the short-run and the long-run marginal cost. It is necessary to demonstrate this assertion.*
Real systems often are relatively inflexible in their capability. That is, after a point, significant increases in operating costs will expand output very little, i.e., the cost curves have an "elbow" where they become vertical when capacity is reached. It can be proven that for a family of such short-run curves, representing either expanded units or additional units, the envelope of the short-run curves is the long-run total-cost curve. This is illustrated in Figure 3. This is the case that will be dealt with
Figure 3. Relationship of long-run to short-run total costs.
here in demonstrating that optimum resource allocation calls for equality of short- and long-run marginal costs.† In the ensuing discussion it will be useful to identify short-run costs with operating costs, and long-run costs with operating plus investment costs. In dealing with a system with very little flexibility, we are able to illustrate a means of approximating the solution for the case in which the system is completely inflexible.
The following notation will be used:
C: total cost variable over the short run (operating costs only)
C̄: total cost variable over the long run (operating plus investment costs)
C_f: total investment cost (fixed in the short run)
*A rigorous treatment is presented in an appendix to Ref. [2].
†This can be relaxed, but at the sacrifice of simplicity in exposition and application. See appendix to Boiteux [2].
q: output of a system
q₀: capacity which meets some given requirement
z: marginal operating cost (variable in the short run) = dC/dq
x: marginal investment cost (variable in the long run) = dC_f/dq₀
y: marginal long-run cost = dC̄/dq₀
w: an approximation of z for q ≤ q₀, defined by (C − C_f)/q, i.e., the average avoidable cost.
For q ≤ q₀, let the total cost function be*
(1) C = C_f + wq.
Over the long run, adjustment can be made to the required capacity (i.e., investment can be varied). Therefore (1) can be expressed as
(2) C̄ = C_f(q₀) + wq₀,
and, assuming that w does not vary with the number of units,
(3) dC̄/dq₀ = dC_f/dq₀ + w = y.
That is, the long-run marginal cost of a number of units is
(4) y = x + w,
by the definitions of y, x, and w; i.e., the long-run marginal cost is equal to the sum of the marginal investment cost and the average avoidable cost.
If there is true inflexibility at q₀, short-run marginal cost is indeterminate. However, if there is a slight bit of flexibility, z becomes very large as q → q₀. Therefore, it becomes equal to y at some point on its vertical arm. At this point, long-run and short-run marginal costs are equal, which establishes the solution of price equal to long-run and short-run marginal cost,
(5) p = y = x + w = z.
This is illustrated in Figure 4.
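Equations (3)-(4) can be checked with a line of arithmetic. The linear investment cost C_f(q₀) = a·q₀ (so that x = a) and the numbers below are illustrative choices, not taken from the paper.

```python
# Finite-difference sketch of (3)-(4): with an assumed linear investment
# cost C_f(q0) = a * q0 and average avoidable cost w, the long-run marginal
# cost of capacity equals x + w.
a, w = 10.0, 4.0                   # x = dC_f/dq0 = a

def C_bar(q0):                     # (2): C-bar = C_f(q0) + w * q0
    return a * q0 + w * q0

step = 1e-6
y = (C_bar(50.0 + step) - C_bar(50.0)) / step   # numerical dC-bar/dq0
assert abs(y - (a + w)) < 1e-4                  # (4): y = x + w
print(round(y, 3))
```

Setting the mission "price" p equal to this y is then exactly condition (5), since the short-run marginal cost z meets y on the vertical arm of the cost curve at capacity.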
The Basic Principle of Common Cost Allocation
Assume that a given system can perform two missions, and its investment cost is common to both.
That is, there are no characteristics of the system that can be attributed uniquely to one mission or
another.*
*This section follows Boiteux [2] closely. Any errors are likely to be mine, not his.
t Linearity is assumed for convenience and because in applied work one often has only linear approximations in any case. The
linearity assumption does not appear to have a material effect on the analysis.
R. T. CROW
Figure 4. Equality of short-run and long-run marginal costs at capacity output.
In the case of two missions being performed by a single system, the cost functions for each of them,
with flexible capacity, are
(6a) $C_1 = C_f(q_0) + w_1 q_1$

and

(6b) $C_2 = C_f(q_0) + w_2 q_2$,
where $q_1$ and $q_2$ represent the output of missions 1 and 2, respectively. Given a particular output requirement, say $q_1$, the requirement for optimum resource allocation is that the long-run cost for both missions together be minimized, i.e.,

(7) $\frac{\partial (C_1 + C_2)}{\partial q_0} = 0$.
To establish the equality of long- and short-run marginal costs, consider the differential of $C_1 + C_2$,

(8) $d(C_1 + C_2) = \frac{\partial C_1}{\partial q_1}\, dq_1 + \frac{\partial C_2}{\partial q_2}\, dq_2 + 2\,\frac{\partial C_f}{\partial q_0}\, dq_0$.

The first two terms on the right-hand side are short-run marginal costs, $z_1$ and $z_2$, times their respective variations in output. The third term must be equal to zero for cost minimization to hold.

The differential $d(C_1 + C_2)$ may also be written as

(9) $dC_1 + dC_2 = \left[ \frac{dC_f}{dq_0}\, dq_0 + w_1\, dq_1 \right] + \left[ \frac{dC_f}{dq_0}\, dq_0 + w_2\, dq_2 \right]$,
where $dq_1 = dq_2 = dq_0$.* Equation (9) may be divided by $dq_0$ to yield

(10) $\frac{dC_1 + dC_2}{dq_0} = 2\,\frac{dC_f}{dq_0} + w_1 + w_2$.
Since $dC_f/dq_0$ has been defined above as x, the optimum condition of equality of short- and long-run marginal costs is

(11) $z_1 + z_2 = 2x + w_1 + w_2$.

For the prices of the two missions to be equal to the long-run marginal costs of a given system capacity,

(12) $p_1 + p_2 = 2x + w_1 + w_2$.

The allocation of common investment costs follows immediately from (12), which implies

(13) $(p_1 - w_1)/2x + (p_2 - w_2)/2x = 1$.
That is, the share of common costs to be allocated to each mission performed by the system is equal
to the corresponding term in (13).
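As a computational aside (not part of the original analysis, and with invented numbers), the shares defined by (13) can be sketched in Python; the function name is mine:

```python
# Hypothetical illustration of Equation (13): shares of a common
# investment cost x borne by two missions with prices p1, p2 and
# average avoidable (operating) costs w1, w2.

def common_cost_shares(p1, w1, p2, w2, x):
    """Return the two terms of (13), (p_i - w_i)/2x; they sum to 1
    when prices equal long-run marginal costs, p1 + p2 = 2x + w1 + w2."""
    return (p1 - w1) / (2 * x), (p2 - w2) / (2 * x)

# Assumed numbers chosen so that p1 + p2 = 2x + w1 + w2 holds:
# 90 + 60 = 2*50 + 20 + 30 = 150.
x, w1, w2 = 50.0, 20.0, 30.0
p1, p2 = 90.0, 60.0
s1, s2 = common_cost_shares(p1, w1, p2, w2, x)
print(s1, s2, s1 + s2)   # the two shares, and their sum (1.0)
```

Mission 1 here bears 70 percent of the common cost and mission 2 the remaining 30 percent, exactly because its price exceeds its avoidable cost by more.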
Before turning to the problem of how the prices of missions are to be determined, it is instructive
to consider two particular cases. One is the case where one of the missions (mission 2, for example)
has a price such that it is just worth its marginal operating costs, $p_2 = w_2$. In this case all common
costs are allocated to mission 1 by (13). This is illustrated in Figure 5.
Figure 5. One system bearing all common costs and the other system bearing only operating costs.
*Boiteux [2, appendix] has shown that this is likely to be a reasonably good approximation over a wide range of conditions.
It is important to note that this is a generalization of the "two-ship" method of allocating investment costs currently in use (Grey [5], pp. 2, 3, sec. 24). In this method, "major" missions are allocated the entire common investment cost of the system plus their incremental investment costs, while "minor" missions are allocated only their incremental investment costs.
In the case shown in Figure 6, the problem is different in that each mission's price is such as to be
able to cover at least a portion of its common marginal investment costs. In this case, the question
arises: Is one mission allocated only operating costs and the other allocated both its operating and the
common investment costs? And further, if the common investment costs are shared, how are they di
vided? The answer is that since $p_1$ and $p_2$ both exceed w (assumed to be the same for both missions to
keep the diagram simple), common costs must be allocated according to (13), where both terms will be
greater than zero. Thus each mission will bear a share of the common cost.
Figure 6. Both systems bearing common costs.
THE PROPOSED APPROACH TO PRICE DETERMINATION AND COST ALLOCATION
Military cost-effectiveness analysis generally takes one or another of the following forms: (a)
maximize effectiveness subject to a cost constraint, or (b) minimize cost subject to achieving a given
level of effectiveness. In this paper, attention will be devoted to the latter, but the basic approach is
also applicable to the former.
As seen from the discussion above, it is critical to be able to set the prices of various missions.
At first, this requirement appears unorthodox and difficult; but, in fact, it is not far from existing prac
tice. In the case of a single system, the price of one unit of the system (say one plane), plus the wages of
personnel, the expenditure for fuel, and whatever else is necessary for the unit to perform its mission,
can be considered to be the price of the mission. Expressed somewhat differently, the amount of expenditure necessary to meet a given mission requirement divided by the units of a system which must be used is the price of that mission as performed by a particular system.
The Alternative System Method
Consider two single-mission systems which perform different missions. The expenditures on these two systems are shown in Figures 7 and 8. In order to meet effectiveness requirements for each, $q_1$ and $q_2$ units are required, respectively. To meet these requirements, expenditures of $Op_1C_1q_1$ and $Op_2C_2q_2$ are necessary, implying mission prices of $p_1$ and $p_2$.
If a third system performed both missions jointly, it might be procured in such quantity $(q_3)$ that it met the requirements of mission 1 and part of the requirements of mission 2. This is shown in Figure 9. If we assume that the operating costs of system 3 are identical to those of systems 1 and 2, and the investment costs are exactly the same for system 3 as for the sum of systems 1 and 2, then, at the optimum, the prices of the missions for the multimission system are equal to those of the single-mission systems,

$p_1 + p_2 = 2(x_3 + w)$.
This follows from (12), and allocation of common costs (investment costs in this case) follows (13).
That is, in this particular case the shares of common costs differ solely because of the difference in
mission prices.
FIGURE 7. Average price and required units of system in Mission 1 (alternative system method).
FIGURE 8. Price and required units in Mission 2 (alternative system method).
Figure 9. Costs and prices of both missions with a single system.
Clearly the assumption of identical costs is unrealistic. There is no reason to assume that a multimission system will cost precisely the same as the sum of two single-mission systems. In fact, if it did,
there would be no apparent reason to buy it. The only apparent reason for a multimission system is
that it meets a set of requirements less expensively than singlemission alternatives. Thus, to preserve
the equality of prices and marginal costs for the multimission system, it is necessary to scale either the
marginal costs or the prices. Since it makes no difference which is chosen, prices are scaled:
$p_{13} + p_{23} = n(p_1 + p_2) = 2(x_3 + w)$,

where $p_{13}$ and $p_{23}$ are the prices of missions 1 and 2 as performed by system 3, and

(14) $n = \frac{2(x_3 + w)}{(x_1 + w) + (x_2 + w)}$,

from the conditions

$p_1 = x_1 + w$ and $p_2 = x_2 + w$,

established in (5). Thus, the proportions of common investment cost (for example) allocated to different missions are the respective terms of

$(np_1 - w)/2x_3 + (np_2 - w)/2x_3 = 1$.
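The scaling step of the alternative system method can be sketched computationally; the helper names are mine, and the inputs below are invented values for the symbols of (14), not figures from the paper's tables:

```python
# Sketch of the alternative-system method: scale the single-mission
# prices p1, p2 by n (Equation (14)) so that the scaled prices sum to
# the multimission system's long-run marginal cost 2(x3 + w).

def scale_factor(x1, x2, x3, w):
    # n = 2(x3 + w) / [(x1 + w) + (x2 + w)]
    return 2 * (x3 + w) / ((x1 + w) + (x2 + w))

def allocated_shares(p1, p2, x1, x2, x3, w):
    n = scale_factor(x1, x2, x3, w)
    # the two terms (n*p_i - w)/2x3; by construction they sum to 1
    return (n * p1 - w) / (2 * x3), (n * p2 - w) / (2 * x3)

# Hypothetical inputs, with p_i = x_i + w as established in (5).
x1, x2, x3, w = 40.0, 25.0, 30.0, 10.0
p1, p2 = x1 + w, x2 + w
s1, s2 = allocated_shares(p1, p2, x1, x2, x3, w)
print(round(s1 + s2, 10))   # 1.0
```

Whatever the single-mission prices are, the scaling guarantees that the two shares exhaust the common cost exactly.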
Next, consider the systems where certain investment and operating costs can be attributed to
specific missions. The following notation will be used:
$p_{13}$ = price of mission 1 performed by system 3
$p_{23}$ = price of mission 2 performed by system 3
$w_{13}$ = an approximation of marginal operating costs for mission 1 performed by system 3 (see definition of w prior to Equation (1))
$w_{23}$ = an approximation of marginal operating costs of mission 2 performed by system 3
$w_3$ = an approximation of marginal operating costs of system 3 common to missions 1 and 2
$z_{13}$ = marginal operating cost of mission 1 performed by system 3
$z_{23}$ = marginal operating cost of mission 2 performed by system 3
$x_{13}$ = marginal investment cost of mission 1 performed by system 3
$x_{23}$ = marginal investment cost of mission 2 performed by system 3
$x_3$ = marginal investment cost of system 3 common to missions 1 and 2
The critical condition for optimum resource allocation in the simple case presented in (11) was

$z_1 + z_2 = 2x + w_1 + w_2$.

If there are investment costs and operating costs common to both missions, as well as investment costs and operating costs specific to each mission, the specific costs are directly attributable to their respective missions. The condition can be written as

$z_{13} + z_{23} = 2(x_3 + w_3) + x_{13} + w_{13} + x_{23} + w_{23}$,

which preserves the equality of short- and long-run costs for the system. Since the condition for optimum resource allocation is that output is such that price equals marginal cost,

$p_{13} = z_{13}$ and $p_{23} = z_{23}$.

Therefore,

(15) $p_{13} + p_{23} = 2(x_3 + w_3) + x_{13} + w_{13} + x_{23} + w_{23}$

and

$(p_{13} - x_{13} - w_{13}) + (p_{23} - x_{23} - w_{23}) = 2(x_3 + w_3)$,

which implies that allocation of common investment and operating costs follows

(16) $[(p_{13} - x_{13} - w_{13})/2(x_3 + w_3)] + [(p_{23} - x_{23} - w_{23})/2(x_3 + w_3)] = 1$.

Of course, in the case where $p_1$ and $p_2$ are given from single-mission systems, then

$p_{13} = n p_1$ and $p_{23} = n p_2$

from (14).
Knowledge of Historical or Simulated Tradeoffs
If historical data, e.g., Viet Nam, Korea, or World War II, is relevant to the missions in question,
perhaps some tradeoffs can be established. Force structure analysis might also be useful in establishing
such tradeoffs. For example, suppose that as a result of such analysis, a tradeoff could be established
such that the outcome of a campaign would have been the same regardless of whether an amount of
"output" $m_1$ of mission 1 or $m_2$ of mission 2 were provided. Therefore the price of mission 2 relative to mission 1 is

$\frac{p_2}{p_1} = \frac{m_1}{m_2} = k$,

or

$p_2 = k p_1$.
The price of mission 1 performed by system 3 must be such that (15) will hold, that is
$p_{13} + k p_{13} = 2(x_3 + w_3) + x_{13} + w_{13} + x_{23} + w_{23}$,

which implies

(17) $p_{13} = [2(x_3 + w_3) + x_{13} + w_{13} + x_{23} + w_{23}]/(1 + k)$.

Since $p_{23} = k p_{13}$, both $p_{13}$ and $p_{23}$ are determined and allocation of common costs follows (16).
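Given a tradeoff ratio k, (17) is purely mechanical; a sketch with invented cost data (the variable names mirror the paper's symbols, and the function name is mine):

```python
# Sketch of price determination from a tradeoff k = m1/m2,
# per Equation (17): p13 = [2(x3 + w3) + x13 + w13 + x23 + w23]/(1 + k),
# and then p23 = k * p13.

def mission_prices(k, x3, w3, x13, w13, x23, w23):
    total = 2 * (x3 + w3) + x13 + w13 + x23 + w23
    p13 = total / (1 + k)
    p23 = k * p13
    return p13, p23

# Hypothetical cost data; k = 1 says the two missions trade off
# one-for-one, so the derived prices are equal.
p13, p23 = mission_prices(1.0, 50.0, 5.0, 8.0, 12.0, 6.0, 14.0)
print(p13, p23)   # their sum equals the system's long-run marginal cost
```

With k = 1 the long-run marginal cost of 150 splits into two prices of 75; a k of 55/45 would tilt the split toward the more important mission, as in the expert-opinion case below.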
Expert Opinion
Expert opinion may be used to establish the tradeoffs needed for common cost allocation. This appears to be the implicit assumption underlying the two-ship method, in which judgment must be made as to which mission is "major" and which is "minor." The principal distinction is that under the proposal of this paper, if two missions are believed to be approximately equal in importance, common costs would be allocated according to relative prices, taking incremental investment costs and operating costs into account, rather than allocating all or none of the common costs to each system. If one mission was thought to be slightly more important for the system in question than the other, relative prices might be set such that $p_1/p_2 = 55/45$, and so forth. The setting of an absolute price for one mission would follow (17) and the allocation of common costs would follow (16).
A Simple Example of Mission and System Comparisons
At the beginning of the paper, Table 1 presented the costs of achieving given amounts of output
in three missions by five different systems, three single-mission and two multimission. To illustrate
the method, consider the basic problems of the paper: (1) which systems are least expensive in carrying
out their multiple missions, and (2) what are the costs of multiple missions supplied by a single system.
Since we have provided a single-mission system for each of our alternatives, we use the alternative system method of determining mission price. The information of Table 1 and the results of allocation are presented in Table 2. In the illustration, it will be assumed that the entries in the table are costs per unit and that marginal investment and operating costs are constant. Thus, the entries are approximations of marginal costs, i.e., the w's and x's in the notation above. The functions, such as (6a) and (6b), which would yield these parameters could be developed from statistical cost estimating, industrial engineering studies, analogy or even expert opinion, and should reflect all of the sources of costs, e.g., equipment, fuel, personnel, etc., that make the system operational.
Beginning with System D, allocation follows from (14) and (16). That is,

$n_D = \frac{2x_D + w_{D1} + w_{D2}}{(x + w)_A + (x + w)_B} = \frac{100 + 20 + 30}{65 + 45} = 1.36$,
TABLE 2. Marginal Costs of Missions 1, 2, and 3 for All Systems Before and After Allocation of Common Costs

                        Before allocation           After allocation
  Costs             A      B      C      D      E       D      E
  Common                                 50     60
  Mission 1        65                    20             55
  Mission 2               45             30     30      45     49
  Mission 3                      60             10             51
and

$n_D(p_A + p_B) = 2x_D + w_{D1} + w_{D2}$, or $1.36(65 + 45) = 100 + 20 + 30$,

which implies

$(n_D p_A - w_{D1})/2x_D + (n_D p_B - w_{D2})/2x_D = 1$, or $0.69 + 0.31 = 1.00$.

That is, 69 percent of the common cost is borne by Mission 1 and 31 percent by Mission 2. The long-run marginal costs, with the common costs allocated to the specific missions, are therefore:

$Y_{D1} = 0.69(50) + 20 = 55$ and $Y_{D2} = 0.31(50) + 30 = 45$.
The same procedure is followed for System E, where

$n_E = \frac{2x_E + w_{E2} + w_{E3}}{(x + w)_B + (x + w)_C} = \frac{2(60) + 30 + 10}{45 + 60} = 1.52$
and

$(n_E p_B - w_{E2})/2x_E + (n_E p_C - w_{E3})/2x_E = 1$, or $0.32 + 0.68 = 1$.

The shares of common cost allocated to Missions 2 and 3, 32 and 68 percent, are then applied to the marginal common costs, and the sums of the allocated marginal common cost plus the incremental cost for each system are

$Y_{E2} = 0.32(60) + 30 = 49$ and $Y_{E3} = 0.68(60) + 10 = 51$.
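The arithmetic for Systems D and E can be replayed directly; this sketch (the helper name is mine) uses the figures above, and the text's 55/45 come from applying the rounded shares 0.69 and 0.31:

```python
# Replaying the System D and System E allocations from the example.

def n_and_shares(x_common, w1, w2, p1, p2):
    """n as in (14), then the two share terms of the (16)-style split."""
    n = (2 * x_common + w1 + w2) / (p1 + p2)
    s1 = (n * p1 - w1) / (2 * x_common)
    s2 = (n * p2 - w2) / (2 * x_common)
    return n, s1, s2

# System D: common x_D = 50, incrementals 20 and 30; p_A = 65, p_B = 45.
nD, sD1, sD2 = n_and_shares(50.0, 20.0, 30.0, 65.0, 45.0)
YD1 = sD1 * 50 + 20   # text uses the rounded share 0.69, giving 55
YD2 = sD2 * 50 + 30   # text uses the rounded share 0.31, giving 45
print(round(nD, 2), round(sD1, 2), round(sD2, 2))   # 1.36 0.69 0.31

# System E: common x_E = 60, incrementals 30 and 10; p_B = 45, p_C = 60.
nE, sE2, sE3 = n_and_shares(60.0, 30.0, 10.0, 45.0, 60.0)
YE2 = sE2 * 60 + 30
YE3 = sE3 * 60 + 10
print(round(nE, 2), round(sE2, 2), round(sE3, 2), round(YE2), round(YE3))
```

In both cases the two allocated long-run marginal costs sum to 100, the system's total long-run marginal cost, so the allocation exhausts the common cost exactly.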
Now that the costs have been allocated, let us consider the results. Taking System D first, we see
that its cost for performing Mission 1 is less than for the single-mission system, System A. Its cost of
performing Mission 2 is exactly the same as that of System B. Therefore, it would appear that System D
is worthy of consideration for procurement and deployment for Missions 1 and 2, since its cost in each
mission is, at worst, no greater than that of competing systems.
Turning to System E, we find that it is more expensive in Mission 2 than are Systems B and D, i.e., its incremental cost exceeds its price. On the other hand, it is less expensive than C in its performance of Mission 3. Since its long-run marginal cost for Mission 2 exceeds that of B and D, System E will never be used in that mission. This implies that all common costs are to be allocated to missions which it does perform, Mission 3 in this case. However, this means that the long-run marginal cost of Mission 3 is now 70 instead of 51, and it too now exceeds the cost of the single-mission alternative. Thus, it will never be used according to our simplified analysis and does not appear to be a good candidate.
This result has a paradoxical element in that even though E's marginal costs exceed those of the single-mission alternatives when considered separately, the sum of its marginal costs is lower than the sum of the single-mission alternatives. This, of course, is due to all costs being allocated to Mission 3 since Mission 2 is not performed. Does the allocation of common costs thus lead us astray? Would we not be more likely to make good decisions on systems use and procurement if we simply compared total long-run marginal costs and ignored allocation? The answer in this particular example, at any rate (and probably generally, too) is no. Consider the possible ways of accomplishing Missions 1, 2, and 3.
The combinations and their associated costs are presented in Table 3.

TABLE 3. Marginal Costs of Alternative Combinations of Systems Providing Three Missions

  Combination       A      B      C      D      E     Total
  A, B, C          65     45     60                    170
  A, E             65                         100      165
  C, D                           60    100             160
  D, E                                 100    100      200

The result of considering alternative combinations is that the least-cost combination is C and D. In the case of a direct comparison with Systems B and C alone, however, System E does have an advantage. In this case, what is implied is that cost penalties be accepted in Mission 2 in order to retain low-cost capabilities in Mission 3. In this case, the marginal cost of Mission 2 would be set exactly equal to the price of the alternative system, reducing it from 49 to 45, and the marginal cost of Mission 3 would be raised from 51 to 55, i.e., four units of marginal common cost would be reallocated from Mission 2 to Mission 3.
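The comparison in Table 3 is a small enumeration; it can be sketched directly from the allocated costs of Table 2:

```python
# Enumerate the system combinations of Table 3 and pick the least-cost
# way of covering Missions 1, 2, and 3, using allocated marginal costs.

costs = {            # per-system marginal cost after allocation
    "A": 65,         # single-mission: Mission 1
    "B": 45,         # single-mission: Mission 2
    "C": 60,         # single-mission: Mission 3
    "D": 100,        # multimission: Missions 1 and 2 (55 + 45)
    "E": 100,        # multimission: Missions 2 and 3 (49 + 51)
}
combos = [("A", "B", "C"), ("A", "E"), ("C", "D"), ("D", "E")]

totals = {c: sum(costs[s] for s in c) for c in combos}
best = min(totals, key=totals.get)
for c in combos:
    print(c, totals[c])
print("least-cost combination:", best)   # C and D, with total 160
```

The enumeration reproduces the conclusion of the text: even though E beats B and C in a pairwise comparison, the least-cost way of covering all three missions is the combination C, D.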
Common Cost Allocation, Time and Discount Rates
Investments in military systems, like other public and private systems, have particular useful
lives and are subject to particular rates of discount. Exactly how useful lives and discount rates are
determined are difficult problems in their own right and will not be discussed here. Suffice it to say that
useful lives are functions of wear and tear and obsolescence, and discount rates reflect the terms under
which the values of present and future costs and effectiveness are compared.
Several measures have been employed for evaluating benefits and costs over time. Although each
has its strengths and weaknesses, the best general measure appears to be the net discounted present
value of an investment.* This is the measure to be employed for the purpose of illustrating how dis
counting and system life may be handled for multiplemission systems. First, let us consider a single
mission system. Its net discounted present value (V) is:
(12) $V = \sum_i d_i (p q_0 - \bar{C})$,

where $d_i$ is the discount factor in year i, or $(1 + r)^{-i}$, where r is the rate of interest in use for military systems and is assumed, for the sake of simplicity, to be constant for all relevant years.† The other variables are as described above. In words, then, V is the sum of a stream of net benefits, each year's entry being discounted by a greater amount than that of the year before.**
If we extend this to the case of two missions performed by a multiple-mission system, where the investment costs are common to both missions and the missions have identical useful lives and are subject to the same interest rate, we have

(18) $V_n = \sum_i d_i \left( p_1 q_0 + p_2 q_0 - C_f(q_0) - w_1 q_0 - w_2 q_0 \right)$.

The first-order condition for the maximization of the net discounted present value of the two missions performed by the system is

$\frac{dV_n}{dq_0} = 0$.

This implies, following Boiteux [2, appendix],

(19) $\sum_i d_i (p_1 + p_2) = \sum_i d_i (2x + w_1 + w_2)$.

Since both sides may be divided by $\sum_i d_i$, we see that (19) reduces to

$p_1 + p_2 = 2x + w_1 + w_2$.
*For a comparison of the more prominent measures, see Baumol [1, ch. 19].
†The interest rate is presumably based on some notion of social time preference or opportunity cost. For a discussion of some of the issues involved see Prest and Turvey [7, pp. 697-700].
**The link from "benefits" to expenditure ($pq_0$) in our context is that $pq_0$ is the expenditure necessary to meet a particular requirement, and it is the meeting of the requirement that is the benefit.
Thus, allocation in this case follows the same lines as (13), (16), etc., and the interest rate and useful life play no role.
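That the discount factors drop out of (19) is easy to verify numerically; the interest rate, horizon, and cost figures below are assumed for illustration:

```python
# Check numerically that the discount factors d_i = (1 + r)**(-i) cancel
# in (19): sum_i d_i (p1 + p2) = sum_i d_i (2x + w1 + w2) reduces to
# p1 + p2 = 2x + w1 + w2 after dividing both sides by sum_i d_i.

r = 0.08                        # assumed interest rate
years = range(1, 11)            # assumed 10-year useful life
d = [(1 + r) ** (-i) for i in years]

p1, p2 = 90.0, 60.0             # hypothetical prices, p1 + p2 = 150
x, w1, w2 = 50.0, 20.0, 30.0    # so that 2x + w1 + w2 = 150

lhs = sum(di * (p1 + p2) for di in d)
rhs = sum(di * (2 * x + w1 + w2) for di in d)
print(lhs == rhs, lhs / sum(d))   # dividing by sum(d) recovers p1 + p2
```

The equality holds for any r and any horizon, which is exactly why the allocation is insensitive to discounting when useful lives and interest rates are common to both missions.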
If there are different useful lives of the system in different missions, and/or the interest rate differs,
a general solution of the allocation problem has not been found, although solutions have been found
for specific cost functions, e.g., linear functions. The problem arises from the translation of units of
output of particular missions to units of the multimission systems. The simplicity of the costallocation
scheme proposed in this paper is due to Boiteux's demonstration that this can be done with great
generality when time does not enter the picture. This breaks down, however, where the missions have
different useful lives while performed by the multiplemission system in question or have different
interest rates, since it is not generally possible to translate them from specific missions to units of the
multiplemission system. It appears that the basic approach retains its validity, but not its simplicity.
CONCLUSION
This paper has presented a new technique for allocating the common costs of multiplemission
systems. One major departure from existing practice is that the basis for allocation is to be found
in the importance of the missions as reflected in their relative prices or, more generally, on an assess
ment of the relative abilities of a system to carry out alternative missions. The second major departure
is that it uses marginal conditions rather than proportional or "eitheror" allocations. Thus, unlike
existing techniques, it is consistent with the principles of efficient resource allocation.
These are not as drastic departures from current practice as they may seem, since some notion of
relative importance is implicit in the distinction of major from minor missions in the currently employed
twoship method of allocation. The twoship method also adheres to a rough approximation to marginal
principles. In a very real sense, what has been presented above can be regarded as a generalization of
the twoship method, as well as an explicit statement of the principles underlying it.
The major problem in employing the proposed method is, of course, to develop means of measuring
the relative prices of different missions supplied by a system. Several tentative suggestions have been
offered which, crude as they are, should aid in achieving better allocation of common costs. In all like
lihood better means can and will be devised through experience if the proposed approach is employed.
ACKNOWLEDGMENT
The author is grateful for discussions and correspondence with A. S. Rhode, J. T. Kammerer, K. F.
Linder, Saul Gass, George Taylor, Kenneth Babka and William Baumol. I wish to thank them but also
absolve them of any blame for whatever errors or misconceptions remain.
REFERENCES
[1] Baumol, W. J., Economic Theory and Operations Analysis (Prentice-Hall, Englewood Cliffs, N.J., 1965), 2nd ed.
[2] Boiteux, M., "Peak Load Pricing," in J. R. Nelson (ed.), Marginal Cost Pricing in Practice (Prentice-Hall, Englewood Cliffs, N.J., 1964), chap. 4, pp. 59-90.
[3] Carlson, S., A Study on the Pure Theory of Production (Kelley and Millman, New York, 1956).
[4] Crow, R. T., "The Allocation of Common Costs of Multiple-Mission Systems," a report to Systems Analysis, Chief of Naval Operations, Contract No. N00014-70-C-0086, by MATHEMATICA, Inc., Bethesda, Md. (Nov. 1971).
[5] Grey, J. C., Cost Analysis Methodology (Fire Support Study Working Paper No. 9), Dahlgren, Va.: U.S. Naval Weapons Laboratory (July 1970).
[6] Littlechild, S. C., "Marginal Cost Pricing with Joint Costs," Economic Journal LXXX, 323-335 (June 1970).
[7] Prest, A. R. and R. Turvey, "Cost-Benefit Analysis: A Survey," Economic Journal LXXV, 683-735 (Dec. 1965).
AN EXPLICIT GENERAL SOLUTION IN LINEAR FRACTIONAL
PROGRAMMING*
A. Charnes
Center for Cybernetic Studies
University of Texas
W. W. Cooper
School of Urban and Public Affairs
Carnegie-Mellon University
ABSTRACT
A complete analysis and explicit solution is presented for the problem of linear fractional programming with interval programming constraints whose matrix is of full row rank. The analysis proceeds by simple transformation to canonical form, exploitation of the Farkas-Minkowski lemma and the duality relationships which emerge from the Charnes-Cooper linear programming equivalent for general linear fractional programming. The
formulations as well as the proofs and the transformations provided by our general linear
fractional programming theory are here employed to provide a substantial simplification for
this class of cases. The augmentation developing the explicit solution is presented, for
clarity, in an algorithmic format.
I. INTRODUCTION
The linear fractional programming problem arises in many contexts with relatively simple constraint sets, e.g., in the reduction of integer programs to knapsack problems, in attrition games, and in Markovian replacement problems as well as in Neyman-Pearson rejection region selection problems. Illustrative examples are provided by G. Bradley [5] or F. Glover and R. E. Woolsey [12],† J. Isbell and W. Marlow [13], C. Derman [10], and M. Klein [16].
The linear fractional programming problem in all generality, and with all singular cases considered,
was reduced in [8] to at most a pair of ordinary linear programming problems. This immediately made
available all of the algorithms, interpretations, etc., that are associated with linear programming.
This includes, we should note, access to any ordered field,** and any of the algorithms and the com
puter codes for linear programming problems which, by virtue of [8], thereby also become available
for any problem in linear fractional form. Thus, with the development in [8], the work in linear frac
tional programming took a different form from its previous sole concern with the development of special
types of algorithms for dealing with this kind of problem.
*This research was partly supported by a grant from the Farah Foundation and by ONR Contracts N00014-67-A-0126-0008 and N00014-67-A-0126-0009 with the Center for Cybernetic Studies, The University of Texas. This report was also prepared as part of the activities of the Management Sciences Research Group at Carnegie-Mellon University under Contract N00014-67-A-0314-0007, NR 047-048, with the U.S. Office of Naval Research. Reproduction in whole or in part is permitted for any purpose of the U.S. Government.
t See also E. Balas and M. Padberg [1].
** See, e.g., the development of the opposite sign theorem and related developments in [7].
A. CHARNES AND W. W. COOPER
In the present paper, we apply our reduction, as given in [8], to a general class of linear fractional problems, viz., those for which the constraint set is given by

(1.1) $a \le Ax \le b$,

so that this part of the model is in "interval programming" form.* Here we shall assume that the matrix A is of full row rank and the vectors a, b, and x meet the usual conditions for conformance. This means that the constraint set is a parallelepiped. See the Final Appendix in [7].

Subject to conditions (1.1), we wish to

(1.2) maximize $R(x) = \frac{N(x)}{D(x)} = \frac{c^T x + c_0}{d^T x + d_0} \not\equiv \text{constant}$,
so that we are now concerned with a problem of linear fractional programming. Because A is of full row rank it has a right inverse, $A^*$, and hence we can write

(1.3) $AA^* = I$.

Now, setting

$y = Ax$,

or

$x = A^* y + Pz$,

where

(1.4) $P = I - A^* A$

and z is arbitrary, we obtain

(1.5) $\max R(y, z) = \frac{c^T A^* y + c^T P z + c_0}{d^T A^* y + d^T P z + d_0}$,

subject to

$a \le y \le b$,

in place of (1.1) and (1.2).
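As a numerical aside (the 1×2 matrix below is an invented example, and $A^* = A^T(AA^T)^{-1}$ is one standard right inverse, not necessarily the authors' choice), the identities behind (1.3)-(1.5) can be checked directly:

```python
# Illustrate the right inverse A* and the matrix P = I - A*A for a
# hypothetical full-row-rank 1x2 matrix A = [1, 2].
# Here A* = A^T (A A^T)^{-1}, so A A* = I (a 1x1 identity).

A = [1.0, 2.0]
AAt = sum(a * a for a in A)              # A A^T = 5
Astar = [a / AAt for a in A]             # right inverse (2x1)
AAstar = sum(a * s for a, s in zip(A, Astar))
print(AAstar)                            # A A* = 1, up to rounding

# P = I - A*A maps any z into the null space of A.
P = [[(1.0 if i == j else 0.0) - Astar[i] * A[j] for j in range(2)]
     for i in range(2)]
# A P = 0: moving x by Pz leaves y = Ax unchanged, which is why z is
# "arbitrary" in (1.5).
AP = [sum(A[i] * P[i][j] for i in range(2)) for j in range(2)]
print(AP)                                # ~[0, 0] up to rounding
```

This is exactly the mechanism of the reduction: the y-part of x ranges over the box $a \le y \le b$, while the Pz-part moves freely without affecting the constraints.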
Because z is arbitrary,** unless

(1.6) $c^T P = d^T P = 0$,

*See [2]-[4] and [18]-[19].
**Observe that we have ruled out the case in which R is identically constant in (1.2).
LINEAR FRACTIONAL PROGRAMMING 451
we shall obtain max $R = \infty$. In order to avoid repetitious arguments, however, we defer the proof of this until we have discussed the situation $c^T P = d^T P = 0$. See Section IV, below.

Waiving this consideration, we shall next proceed to solve this problem explicitly and in all generality by means of the following three characterizations: the denominator D(x) is (1) bisignant, (2) unisignant and nonvanishing, or (3) unisignant and vanishing on the constraint set. In (1), i.e., the bisignant case, we shall show that max $R(x) = \infty$. Furthermore, we shall show how to identify this case at the outset so that it may be discarded from further consideration. This will leave us with only cases (2) and (3) to examine, where we shall proceed to transformations from which a one-pass numerical comparison of coefficients makes explicit the optimal value and solution.

After this has all been done, we shall then return to assumption (1.6) in a way that utilizes the preceding developments. Finally we shall supply numerical examples to illustrate some of these situations and then we shall draw some conclusions for further research which will return to the remarks at the opening of this section.
II. BISIGNANT DENOMINATORS
Employing assumption (1.6), our problem is

(2.1) $\max R(y) = \frac{c^T A^* y + c_0}{d^T A^* y + d_0}$,

subject to

$a \le y \le b$,

in place of (1.5). Note, however, that here, and in the following, we shall slightly abuse notation by continuing to use the symbols R, N, D, as in (1.1) and (1.2) even though we mean the transformed function, as in (2.1).

Let $\check{D}$, $\hat{D}$ denote the maximum and minimum, respectively, of D over the constraint set. We note:

LEMMA 1:
(a) D is bisignant if and only if $\check{D} > 0$ and $\hat{D} < 0$.
(b) D is unisignant if and only if either $\check{D} \le 0$ or $\hat{D} \ge 0$.

In terms of y, since we can choose each component of y independently (see (1.5)), we can express $\check{D}$ and $\hat{D}$ immediately as

(2.2) $\check{D} = \sum_+ d'_j b_j + \sum_- d'_j a_j + d_0 = \max D(y)$,
$\hat{D} = \sum_+ d'_j a_j + \sum_- d'_j b_j + d_0 = \min D(y)$,

where $d'_j = (d^T A^*)_j$ is the jth element of $d^T A^*$, and "+" or "−" indicates that the summation is over only the positive or negative $d'_j$.
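Formula (2.2) amounts to choosing each $y_j$ at the bound dictated by the sign of $d'_j$; a brute-force check over the vertices of the box confirms this (the data below are invented):

```python
# Check (2.2): the max and min of D(y) = sum_j d'_j y_j + d_0 over the
# box a <= y <= b are attained by setting each y_j, componentwise, to
# the bound selected by the sign of d'_j.

from itertools import product

dprime = [2.0, -1.0, 0.5]        # hypothetical d'_j
a = [0.0, 1.0, -2.0]
b = [3.0, 4.0, 2.0]
d0 = -1.0

Dmax = sum(dj * (bj if dj > 0 else aj)
           for dj, aj, bj in zip(dprime, a, b)) + d0
Dmin = sum(dj * (aj if dj > 0 else bj)
           for dj, aj, bj in zip(dprime, a, b)) + d0

# Brute force over all 2^3 vertices of the box; a linear function
# attains its extremes at vertices.
vals = [sum(dj * yj for dj, yj in zip(dprime, y)) + d0
        for y in product(*zip(a, b))]
print(Dmax, max(vals), Dmin, min(vals))
```

For this instance $\check{D} = 5 > 0$ and $\hat{D} = -6 < 0$, so by Lemma 1 the denominator is bisignant and, as Section II shows, max $R = \infty$.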
Let us first consider the bisignant situation. If we make the transformation of variables

(2.3) $y_j - a_j = k_j \xi_j$, for $d'_j \ge 0$; $b_j - y_j = k_j \xi_j$, for $d'_j < 0$,

where the $k_j > 0$ will be suitably chosen, the constraints transform to

(2.4) $0 \le k_j \xi_j \le b_j - a_j$, or $0 \le \xi_j \le \delta_j = \frac{b_j - a_j}{k_j}$.

Without loss of generality, the $\delta_j$ are positive, since otherwise $\xi_j = 0$ and does not enter into the optimization.

By choosing the $k_j$ suitably, we obtain the form

(2.5) $R(\xi) = \frac{\sum_j \gamma_j \xi_j + \gamma_0}{\sum_+ \xi_j + \sum_- \xi_j - 1}$.

Note that $\sum_+ \delta_j + \sum_- \delta_j > 1$, since otherwise D would not be bisignant.
One of the following two cases must now hold:

CASE (i): for some $\bar\xi$, $0 \le \bar\xi \le \delta$, such that $D(\bar\xi) = 0$ we have $N(\bar\xi) \ne 0$;

or else

CASE (ii): for every $\bar\xi$, $0 \le \bar\xi \le \delta$, such that $D(\bar\xi) = 0$ we have $N(\bar\xi) = 0$.

In Case (i), since $N(\xi)$ is continuous, there is a neighborhood of $\bar\xi$ in which $N(\xi)$ is unisignant. Since

(2.6) $\sum_+ \bar\xi_j + \sum_- \bar\xi_j = 1 < \sum_+ \delta_j + \sum_- \delta_j$,

and $0 \le \bar\xi_j \le \delta_j$ in the constraint set, we can choose $\epsilon_j \ge 0$, $\sum_j \epsilon_j > 0$, so that $0 \le \bar\xi \pm \epsilon \le \delta$, $\operatorname{sgn} N(\bar\xi + \epsilon) = \operatorname{sgn} N(\bar\xi - \epsilon)$, and $D(\bar\xi + \epsilon) > 0 > D(\bar\xi - \epsilon)$. By approaching $\bar\xi$ along the line segment from one of $\bar\xi + \epsilon$, $\bar\xi - \epsilon$, we can make $R(\xi) \to \infty$.
In Case (ii), we must have $\gamma_j = 0$ for all j such that $d'_j = 0$. For $D(\xi) = 0$ involves specifying only the $\xi_j$ for $d'_j > 0$ and $d'_j < 0$. If $\gamma_{j_0} \ne 0$ for some $j_0$ with $d'_{j_0} = 0$, then, having made $D(\bar\xi) = 0$, we can change the value of $N(\bar\xi)$ by changing $\xi_{j_0}$. Thus $D(\bar\xi) = 0$ would not imply $N(\bar\xi) = 0$. We therefore drop the "+", "−" notation in considering Case (ii) and rewrite it as $\sum_j \gamma_j \xi_j + \gamma_0 = 0$ whenever $0 \le \xi_j \le \delta_j$ and $\sum_j \xi_j = 1$.

By letting $y_j = \xi_j y_0$, $y_0 \ge 0$, this becomes $\pm\left(\sum_j \gamma_j y_j + \gamma_0 y_0\right) \ge 0$ whenever

(2.7) $\sum_j y_j - y_0 = 0$,
$-y_j + \delta_j y_0 \ge 0$,
$y_j \ge 0$,
$y_0 \ge 0$.

Note that the implication extends to $y_0 = 0$ since $y_0 = 0$ implies all $y_j = 0$.
We now apply the Farkas-Minkowski Lemma* to the pair of implications in (2.7) and obtain

(2.8) $\gamma_j = \mu^+ - \theta_j^+ + \nu_j^+$,
$\gamma_0 = -\mu^+ + \sum_j \delta_j \theta_j^+ + \nu_0^+$; $\theta_j^+, \nu_j^+, \nu_0^+ \ge 0$,

for the first implication, viz., $\left(\sum_j \gamma_j y_j + \gamma_0 y_0\right) \ge 0$. For the second one, viz., $-\left(\sum_j \gamma_j y_j + \gamma_0 y_0\right) \ge 0$, we obtain

(2.9) $-\gamma_j = \mu^- - \theta_j^- + \nu_j^-$,
$-\gamma_0 = -\mu^- + \sum_j \delta_j \theta_j^- + \nu_0^-$; $\theta_j^-, \nu_j^-, \nu_0^- \ge 0$.
Adding the first expressions in (2.8) and (2.9),

(2.10) $0 = (\mu^+ + \mu^-) - (\theta_j^+ + \theta_j^-) + (\nu_j^+ + \nu_j^-)$, or $\theta_j^+ + \theta_j^- = (\mu^+ + \mu^-) + (\nu_j^+ + \nu_j^-)$.

Adding the second pair,

(2.11) $0 = -(\mu^+ + \mu^-) + \sum_j \delta_j(\theta_j^+ + \theta_j^-) + (\nu_0^+ + \nu_0^-)$, or $\mu^+ + \mu^- = \sum_j \delta_j(\theta_j^+ + \theta_j^-) + \nu_0^+ + \nu_0^-$.

Since each term on the right is nonnegative, we have

(2.12) $\mu^+ + \mu^- \ge 0$.

Next, substituting from (2.10) into (2.11), we get

$0 = -(\mu^+ + \mu^-) + \sum_j \delta_j(\mu^+ + \mu^-) + \sum_j \delta_j(\nu_j^+ + \nu_j^-) + \nu_0^+ + \nu_0^-$,

(2.13) or

$0 = \left(\sum_j \delta_j - 1\right)(\mu^+ + \mu^-) + \sum_j \delta_j(\nu_j^+ + \nu_j^-) + \nu_0^+ + \nu_0^-$.

Since the right-hand side is a sum of nonnegative terms, each of these must be zero. Moreover,

$\mu^+ + \mu^- = 0$ since $\sum_j \delta_j - 1 > 0$,
*See Appendix C in [7].
454 A  CHARNES AND W. W. COOPER
(2.14)	ν_j^+ = ν_j^− = 0 and ν_0^+ = ν_0^− = 0, since δ_j > 0.

By virtue of (2.14), and going back to (2.10),

(2.15)	θ_j^+ + θ_j^− = 0.

Further, with θ_j^+, θ_j^− ≥ 0, we must have θ_j^+ = θ_j^− = 0 for all j. Therefore, γ_j = μ^+ for all j, and γ_0 = −μ^+,
so that we have

(2.16)	R(ξ) = (Σ_j γ_j ξ_j + γ_0)/(Σ_j ξ_j − 1) = μ^+(Σ_j ξ_j − 1)/(Σ_j ξ_j − 1) = μ^+ = constant.

In other words, Case (ii) can only occur in the trivial instance where the numerator is a constant multiple
of the denominator. In this case, each coefficient in the numerator is the same multiple of the corresponding
coefficient in the denominator, and this would have to be true in the original N(x), D(x)
description and hence obvious upon comparing the initial coefficients. Since we have ruled out this
very obvious case (see (1.2)), we have only max R(x) = ∞ when D(x) is bisignant on the constraint set.
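The unboundedness mechanism can be seen in a one-variable sketch. The functions below are hypothetical and ours, not the paper's: N(x) = x + 2 and D(x) = 2x − 1 on 0 ≤ x ≤ 1, so D is bisignant on the constraint set and N is positive at the zero of D; approaching that zero from the side where D > 0 drives R = N/D to +∞.

```python
# Hypothetical one-dimensional illustration of the bisignant case:
# D changes sign on [0, 1] and N > 0 at the zero of D, so R = N/D
# is unbounded above on the constraint set.
def R(x):
    return (x + 2.0) / (2.0 * x - 1.0)   # N(x) = x + 2, D(x) = 2x - 1

# approach the zero of D (x = 1/2) from the side where D > 0
for eps in (1e-1, 1e-3, 1e-6):
    print(R(0.5 + eps))   # grows without bound as eps -> 0
```

Approaching from the other side (D < 0) drives R to −∞ instead, which is why only the sign of N near the zero of D matters for the max.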
III. UNISIGNANT DENOMINATORS

The unisignant cases now remain to be considered. If D ≤ 0 we multiply both N and D by −1
(thus not altering the value of R) and we are then reduced to "D" ≥ 0. With this normalization, we
make a transformation of variables as in (2.3),

(3.1)	y_j − a_j = g_j ξ_j,	d_j ≥ 0
	b_j − y_j = g_j ξ_j,	d_j < 0,

where the g_j > 0 will be suitably chosen. The constraint set will now be

(3.2)	0 ≤ ξ_j ≤ δ_j = (b_j − a_j)/g_j

and, first considering the case where D > 0, the g_j can be chosen so that the problem is

(3.3)	max R(ξ) = (Σ_j γ_j ξ_j + γ_0)/(Σ_j ξ_j + 1),	0 ≤ ξ_j ≤ δ_j.
In (3.3) the summation is only over "+" and "−" because, the denominator being positive, optimal
values for the ξ_j such that d_j = 0 can be specified as ξ_j = 0 when γ_j < 0 and ξ_j = δ_j when γ_j ≥ 0, and these
new constant terms are assumed to be already contained in γ_0. By the reduction that we gave in [8],
however, the equivalent linear programming problem is

	max Σ_j γ_j η_j + γ_0 η_0,

subject to

(3.4)	Σ_j η_j + η_0 = 1
	η_j − δ_j η_0 ≤ 0
	η_j ≥ 0,

where we can also note that these constraints imply η_0 > 0.
The dual to (3.4) is

	min u

subject to

(3.5)	u + ω_j ≥ γ_j
	u − Σ_j δ_j ω_j ≥ γ_0
	ω_j ≥ 0.
We shall employ this dual in an essential manner to obtain our desired one-pass argument for obtaining
an optimum. At each step in the procedure, we shall have a solution to a less restrictive problem than
the dual problem and an associated primal feasible solution.
Suppose the γ's are renumbered so that γ_1 ≥ γ_2 ≥ . . . ≥ γ_n.
Then Case (i), γ_0 ≥ γ_1, has the immediately obvious primal solution η_0* = 1, η_j* = 0, and max R(ξ) = γ_0.
In the contrary case, Case (ii),

	γ_1 ≥ . . . ≥ γ_p > γ_0 ≥ γ_{p+1} ≥ . . . ≥ γ_n,

we build up an algorithm based on the dual problem in which we choose u^q at the qth step to satisfy

(3.6)	u^q + ω_j^q = γ_j,	j = 1, . . ., q
	u^q − Σ_{j=1}^q δ_j ω_j^q = γ_0.

Using the first q equations to obtain the ω_j^q in terms of u^q and substituting in the last equation, we obtain
(3.7)	u^q (1 + Σ_{j=1}^q δ_j) = Σ_{j=1}^q γ_j δ_j + γ_0

and hence

(3.8)	u^q = (γ_0 + Σ_{j=1}^q γ_j δ_j)/(1 + Σ_{j=1}^q δ_j).

Thus, u^q is a convex combination of γ_0, γ_1, . . ., γ_q with proportionality constants 1, δ_1, . . ., δ_q.
Note that if u^q ≤ γ_q, then u^q, ω_j^q, j = 1, . . ., q, satisfy the first q constraints plus the "γ_0" constraint
of the dual problem; hence they satisfy a less restrictive problem than the dual.
If we take

(3.9)	η_0^q = 1/(1 + Σ_{j=1}^q δ_j)

and

(3.10)	η_j^q = δ_j/(1 + Σ_{j=1}^q δ_j),	j = 1, . . ., q
	η_j^q = 0,	j > q,

then η_j^q, j = 0, . . ., n is a feasible solution to the primal problem and Σ_j γ_j η_j^q + γ_0 η_0^q = u^q (by substitution
in (3.4) and comparison with (3.8)).
Hence, whenever we can get u^q, ω_j^q feasible for the dual problem, we will have a primal feasible
solution η^q with the same functional value and thus we will have an optimal pair of dual solutions.
This, plus the equivalences maintained via (3.1) and our theory from [8], justifies the development
that we detail as follows:
To start,

(3.11)	u^1 = (γ_0 + γ_1 δ_1)/(1 + δ_1).

(Note, u^1 > γ_0 since γ_1 > γ_0 and u^1 is a proper convex combination of γ_1, γ_0.)
We check: Is u^1 ≥ γ_2?
If yes, we are done:

	u^1 = u*,	ω_1^1 = ω_1* = γ_1 − u*,	ω_j^1 = ω_j* = 0, j > 1

and

	η_0^1 = η_0* = 1/(1 + δ_1)
	η_1^1 = η_1* = δ_1/(1 + δ_1)
	η_j^1 = η_j* = 0, j > 1.

If no, then u^1 < γ_2, and we form

	u^2 = (γ_0 + γ_1 δ_1 + γ_2 δ_2)/(1 + δ_1 + δ_2)
	    = ((γ_0 + γ_1 δ_1)/(1 + δ_1)) ((1 + δ_1)/(1 + δ_1 + δ_2)) + (δ_2/(1 + δ_1 + δ_2)) γ_2
	    = u^1 ((1 + δ_1)/(1 + δ_1 + δ_2)) + (δ_2/(1 + δ_1 + δ_2)) γ_2 < γ_2,

since u^1 < γ_2.
Next, is u^2 ≥ γ_3?
If yes, we are done with the substitutions indicated by (3.6).
If no, u^2 < γ_3 and we continue to u^3.
This process must stop by u^p at the latest, since

	u^p = (γ_0 + Σ_{j=1}^p γ_j δ_j)/(1 + Σ_{j=1}^p δ_j) > γ_0 ≥ γ_{p+1} ≥ . . . ≥ γ_n.

Thus max R(ξ) = u^s, where s is the least positive integer such that u^s ≥ γ_{s+1}, and

	u^s = (γ_0 + Σ_{j=1}^s γ_j δ_j)/(1 + Σ_{j=1}^s δ_j).

This concludes the case D > 0.
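The one-pass computation just described is compact enough to state as code. The sketch below (our rendering, not the authors') assumes the problem has already been reduced to the form (3.3): `gammas` holds the transformed numerator coefficients sorted so that γ_1 ≥ . . . ≥ γ_n, `deltas` the matching bounds δ_j > 0, and `gamma0` the constant term.

```python
def max_ratio(gammas, deltas, gamma0):
    """One-pass scheme for the case D > 0 (a sketch of Section III).

    gammas must be sorted in nonincreasing order; deltas are the
    matching upper bounds delta_j > 0.  Returns max R(xi) = u^s."""
    # Case (i): gamma0 >= gamma_1, so eta_0* = 1 is optimal.
    if not gammas or gamma0 >= gammas[0]:
        return gamma0
    num, den = gamma0, 1.0            # u^q = num/den, as in (3.8)
    for q in range(len(gammas)):
        num += gammas[q] * deltas[q]  # fold in gamma_{q+1}, delta_{q+1}
        den += deltas[q]
        u = num / den                 # convex combination of gamma0..gamma_q
        # stop at the least s with u^s >= gamma_{s+1}
        if q + 1 == len(gammas) or u >= gammas[q + 1]:
            return u
```

On the data of the example in Section VI (γ = (3/2, 0, −2), δ = (6, 4, 2), γ_0 = 1/2) this returns 19/14 after a single step.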
The remaining case has D = 0, i.e., the denominator is nonnegative but can vanish on the constraint
set. By making a transformation of variables as in (3.1), and choosing the g_j > 0 suitably, we obtain

	max R(ξ) = (Σ_j γ_j ξ_j + γ_0)/Σ_j ξ_j,	0 ≤ ξ_j ≤ δ_j.

We may dispose of two situations immediately:
(i) γ_0 > 0: then R(ξ) → ∞ as ξ → 0.
(ii) γ_0 = 0: here the dual problems are

	max Σ_j γ_j η_j		min u
	with Σ_j η_j = 1	with u + ω_j ≥ γ_j
	η_j − δ_j η_0 ≤ 0	−Σ_j δ_j ω_j ≥ 0
	η_j, η_0 ≥ 0		ω_j ≥ 0.

An optimal solution pair is ω_j* = 0, u* = γ_1 = max_j γ_j, and η_1* = 1, η_j* = 0, j ≥ 2. The maximum of R(ξ) is
thus γ_1.
The remaining instance is
(iii) γ_0 < 0: γ_1 ≥ . . . ≥ γ_p > γ_0 ≥ γ_{p+1} ≥ . . . ≥ γ_n.
The dual linear programs can now be written

	max Σ_j γ_j η_j + γ_0 η_0	min u
	with Σ_j η_j = 1		with u + ω_j ≥ γ_j
	η_j − δ_j η_0 ≤ 0		−Σ_j δ_j ω_j ≥ γ_0
	η_j, η_0 ≥ 0			ω_j ≥ 0.

As before, we define u^q by taking the first q dual constraints, together with the "γ_0" constraint, as
equations. This yields

	u^q = (Σ_{j=1}^q γ_j δ_j + γ_0)/Σ_{j=1}^q δ_j.

It may be easily verified that

	u^q = u^{q−1} (Σ_{j=1}^{q−1} δ_j / Σ_{j=1}^q δ_j) + (δ_q / Σ_{j=1}^q δ_j) γ_q,

or, what is the same thing, u^q is a proper convex combination of u^{q−1} and γ_q. Thus, if u^r < γ_{r+1}, then
u^r < u^{r+1} < γ_{r+1}, as in our earlier argument for D > 0.
The steps of our process are as before: if u^q ≥ γ_{q+1} we are done; otherwise, we test u^{q+1} against
γ_{q+2}. At worst we are done with

	u^p = (Σ_{j=1}^p γ_j δ_j + γ_0)/Σ_{j=1}^p δ_j.

As before, u^s = max R(ξ), where s is the least positive integer such that u^s ≥ γ_{s+1}.
IV. R(y, z)

Returning to (1.5) we consider the remaining cases, in which either

(4)	(a) d^T P = 0, c^T P ≠ 0
or
	(b) d^T P ≠ 0.

In case (a), since z is arbitrary we can make ±c^T Pz → ∞, hence max R(y, z) → ∞. In case (b), we
are in the bisignant denominator situation, since we can make d^T Pz → ±∞. The argument of the
bisignant section of this paper (with the additional variables z) now shows that max R = ∞, since we
have ruled out, a priori, the case in which R = constant.
V. EXAMPLES

Some examples may help to fix and sharpen some of the preceding developments. Thus consider

	max R(x) = (3x_1 − x_3 + 4)/(2x_2)

with

(5.1)	−1 ≤ x_1 + x_3 ≤ 2
	1 ≤ x_2 ≤ 5,

and the variables x_1, x_2, x_3 are otherwise unrestricted.
Here we have,*
(5.2)	A = [1 0 1; 0 1 0],	A# = [1 0; 0 1; 0 0],	P = (I − A#A) = [0 0 −1; 0 0 0; 0 0 1].
To exhibit the development in full detail we next write

(5.3)	c^T A# y = (3, 0, −1)[1 0; 0 1; 0 0](y_1; y_2) = 3y_1,	c_0 = 4
	c^T P z = (3, 0, −1)[0 0 −1; 0 0 0; 0 0 1](z_1; z_2; z_3) = (0, 0, −4)(z_1; z_2; z_3) = −4z_3
	d^T A# y = (0, 2, 0)[1 0; 0 1; 0 0](y_1; y_2) = 2y_2
	d^T P z = (0, 2, 0)[0 0 −1; 0 0 0; 0 0 1](z_1; z_2; z_3) = (0, 0, 0)(z_1; z_2; z_3) = 0
	d_0 = 0.
Evidently in this case d^T P = 0, as witness the next to the last expression. On the other hand, c^T P ≠ 0,
as witness c^T P z = −4z_3 in the second expression of (5.3). Hence condition (a) of the preceding section
obtains and we have R → ∞ even though

(5.4)	−1 ≤ y_1 ≤ 2
	1 ≤ y_2 ≤ 5.

This occurs because z is arbitrary and can be freely chosen in

(5.5)	max R(y, z) = (c^T A# y + c^T P z + c_0)/(d^T A# y + d^T P z + d_0) = (3y_1 − 4z_3 + 4)/(2y_2),
*It may be observed that A# is not unique.
which is the specialization of (1.5) to this case. Of course, the result R → ∞ in (5.1) can be confirmed by
direct inspection, since negative values of z_3 may be selected along with increasingly positive values
of x_1 as required in order to maintain the first interval programming constraint.
This last remark suggests that an adjunction such as
(5.6)	0 ≤ x_3 ≤ 1
will convert (5.1) to a problem with a finite maximum. This yields an A for an interval programming
format with the full row rank condition fulfilled as in
(5.7)	A = [1 0 1; 0 1 0; 0 0 1],	A# = [1 0 −1; 0 1 0; 0 0 1].

On the other hand, A is also of full column rank, so that we also have A# = A^{−1} and

	P = I − A#A = 0.
Hence both of the conditions specified in (1.6) are fulfilled, viz.,

	c^T P = d^T P = 0

for any c^T and d^T.
The problem to be solved is now written

(5.8)	max R(y) = (c^T A# y + c_0)/(d^T A# y + d_0) = (3y_1 − 4y_3 + 4)/(2y_2)
with
	−1 ≤ y_1 ≤ 2
	1 ≤ y_2 ≤ 5
	0 ≤ y_3 ≤ 1.

Evidently the solution to this problem is y_1* = 2, y_2* = 1, and y_3* = 0, so that

	max R(y) = 10/2 = 5.
To obtain the corresponding components of x we simply utilize (1.4) with P = 0 to obtain

(5.9)	(x_1; x_2; x_3) = A# y = [1 0 −1; 0 1 0; 0 0 1](2; 1; 0) = (2; 1; 0).
As may be seen, these x values satisfy (5.1) with (5.6) adjoined. They are evidently also maximal with
R(x) = 5, since x_3 can no longer be negative and x_2 and x_1 are at their lower and upper limits,
respectively.
In some cases the solutions, as above, may be obvious but, of course, this cannot always be expected.
Recourse to the preceding development, however, will produce the wanted results in any
case, as we illustrate by now developing the above example, along with the related background
materials, in some detail as follows:
Because the denominator is unisignant we utilize Section III. Observing that d_2 = 2 is the only
nonzero coefficient in the denominator, and hence that the denominator coefficients are nonnegative,
we have recourse only to the first part of (3.1) in order to write

(6.1)	y_1 − a_1 = g_1 ξ_1
	y_2 − a_2 = g_2 ξ_2
	y_3 − a_3 = g_3 ξ_3,

where, respectively, a_1 = −1, a_2 = 1, and a_3 = 0, via (5.8). The development from (3.1) to (3.2) applies
to this case as

(6.2)	0 ≤ ξ_1 ≤ δ_1 = (b_1 − a_1)/g_1
	0 ≤ ξ_2 ≤ δ_2 = (b_2 − a_2)/g_2
	0 ≤ ξ_3 ≤ δ_3 = (b_3 − a_3)/g_3.
The insertion of (6.1) into the functional then produces

(6.3)	[3(g_1 ξ_1 + a_1) − 4(g_3 ξ_3 + a_3) + 4]/[2(g_2 ξ_2 + a_2)] = [3g_1 ξ_1 − 4g_3 ξ_3 + 1]/[2(g_2 ξ_2 + 1)]

via (5.8). Choosing

(6.4)	g_1 = 1/2,	g_2 = 1,	g_3 = 1/2

and setting

(6.5)	γ_1 = 3/2,	γ_2 = 0,	γ_3 = −4/2 = −2,	γ_0 = 1/2
gives the denominator form wanted for (3.3) as:

(6.6)	max R(ξ) = [(3/2)ξ_1 + 0ξ_2 − 2ξ_3 + 1/2]/(ξ_1 + ξ_2 + ξ_3 + 1)
with
	0 ≤ ξ_1 ≤ 6
	0 ≤ ξ_2 ≤ 4
	0 ≤ ξ_3 ≤ 2.
The transformation ξ_j = η_j/η_0 from our previously developed theory [8] then produces the following
example for (3.4):

(6.7)	max (3/2)η_1 + 0η_2 − 2η_3 + (1/2)η_0
with
	η_1 + η_2 + η_3 + η_0 = 1
	η_1 − 6η_0 ≤ 0
	η_2 − 4η_0 ≤ 0
	η_3 − 2η_0 ≤ 0
	η_1, η_2, η_3 ≥ 0,

where the g_j values of (6.4) combine with (6.2) to give δ_1 = 6, δ_2 = 4 and δ_3 = 2, as required for the
application of (3.4). The corresponding dual, which our previous theory also gives access to, is
	min u
with
(6.8)	u + ω_1 ≥ 3/2
	u + ω_2 ≥ 0
	u + ω_3 ≥ −2
	u − 6ω_1 − 4ω_2 − 2ω_3 ≥ 1/2
	ω_1, ω_2, ω_3 ≥ 0.
This, of course, is the application of (3.5) to the present example.
Since γ_1 = 3/2 exceeds γ_0 = 1/2 we are in the situation of Case (ii) following (3.5). Thus, preserving
our subscript identifications from (6.5), we have

(6.9)	γ_1 > γ_0 ≥ γ_2 ≥ γ_3

in our present situation. We therefore see that the first application of the suggested algorithm should
suffice. (See the remarks which conclude the case D > 0 in Section III.)
Applying (3.11) now produces

(6.10)	u^1 = (1/2 + (3/2)(6))/(1 + 6) = 19/14 > γ_0 = 1/2.
Evidently this also formally satisfies the condition that u^1 equals or exceeds the immediate successor
of γ_1 in (6.9). Hence, we have

(6.11)	u^1 = u* = 19/14
	ω_1^1 = ω_1* = γ_1 − u* = 3/2 − 19/14 = 2/14
	ω_2^1 = ω_2* = 0
	ω_3^1 = ω_3* = 0,

which satisfy the constraints of (6.8), as may be verified, with min u = u* = 19/14.
Moving to the primal problem via (3.9),

(6.12)	η_0^1 = 1/(1 + δ_1) = 1/(1 + 6) = 1/7
	η_1^1 = δ_1/(1 + δ_1) = 6/(1 + 6) = 6/7

and all other η_j^1 = 0. See (3.10).
Inserting these values for the corresponding η_j in (6.7), we see that all constraints are satisfied with

	(3/2)η_1 + (1/2)η_0 = (3/2)(6/7) + (1/2)(1/7) = 19/14,

the same as the value of u*, thereby confirming optimality. In fact, as our theory [8] prescribes, we
need merely apply the expressions

	η_0 = 1/7,	ξ_1 = η_1/η_0 = (6/7)/(1/7) = 6
with all other η_j = 0, and then reverse the development from (6.6) to (6.7) in order to verify that this
value is also optimal for

(6.13)	max R(ξ) = [(3/2)(6) + 1/2]/(6 + 1) = 19/14
with
	0 ≤ ξ_1 ≤ (b_1 − a_1)/g_1 = 6
	0 ≤ ξ_2 ≤ (b_2 − a_2)/g_2 = 4
	0 ≤ ξ_3 ≤ (b_3 − a_3)/g_3 = 2.
Evidently we can now directly effect substitutions in (6.1) and obtain

(6.14)	y_1 = g_1 ξ_1 + a_1 = 3 − 1 = 2
	y_2 = g_2 ξ_2 + a_2 = 0 + 1 = 1
	y_3 = g_3 ξ_3 + a_3 = 0 + 0 = 0.

Then we can proceed exactly as in (5.9) to obtain the values x_1 = 2, x_2 = 1, x_3 = 0 which we previously
observed to be optimal.
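The arithmetic of the example can be checked mechanically. The snippet below (ours) simply replays (3.11) and (3.9)-(3.10) on the data δ_1 = 6, γ_1 = 3/2, γ_0 = 1/2:

```python
# Replaying the Section VI example: one step of the scheme suffices.
delta1, gamma1, gamma0 = 6.0, 1.5, 0.5
u1 = (gamma0 + gamma1 * delta1) / (1.0 + delta1)    # (3.11)
assert abs(u1 - 19.0 / 14.0) < 1e-12                # u^1 = u* = 19/14
# u^1 exceeds gamma_2 = 0, so the primal weights of (3.9)-(3.10) apply:
eta0 = 1.0 / (1.0 + delta1)                         # eta_0* = 1/7
eta1 = delta1 / (1.0 + delta1)                      # eta_1* = 6/7
# same functional value in the primal (6.7): optimality confirmed
assert abs(gamma1 * eta1 + gamma0 * eta0 - u1) < 1e-12
```

The equality of the primal and dual objective values is exactly the optimality certificate used in the text.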
SUMMARY
Although the development in this paper proceeded, for clarity, by algorithmic format, we summarize
below the explicit solution in tabular format for direct theoretical interpretation and utilization.

Summary of Solution

	Denominator		Transformed numerator	R*
	Bisignant					∞
	Unisignant:
	  (a) positive				u^s = (γ_0 + Σ_{j=1}^s γ_j δ_j)/(1 + Σ_{j=1}^s δ_j), least s with u^s ≥ γ_{s+1}
	  (b) nonnegative	γ_0 > 0		∞
				γ_0 < 0		u^s = (γ_0 + Σ_{j=1}^s γ_j δ_j)/Σ_{j=1}^s δ_j, least s with u^s ≥ γ_{s+1}
CONCLUSION
Before proceeding any further we should probably point up, again, the crucial role played by the
general theory (including the transformations and proof procedures) which we introduced in [8] for
making explicit contact between linear fractional and ordinary linear programming, in all generality
and exact detail. These transformations are also utilized in the present paper, and the theory is also
extended via the duality (and other) characterizations given in the preceding text. These are joined
together here for proofs in the algorithmic format we have just illustrated by example and commentary.
Other uses can undoubtedly also be made of this theory and the preceding extensions via the
passage (up and back) between linear fractional and ordinary linear programming that is now possible.
Our general theory has been used by others, too, to extend or simplify parts of linear fractional
programming en route to effecting the contacts with ordinary linear programming that are thereby
obtained. The work of Zionts [22] should perhaps be singled out as being most immediately in line
with the R = ∞ results presented in this paper. Zionts' development is directed only toward simplifying
matters by focusing on eliminating cases for linear fractional programming which are either deemed
to be unwanted or of little interest for practical applications.
His developments cease as soon as the contacts with ordinary linear programming are identified
via our theory, which he, like others, utilizes for this purpose.
We have effected the developments in this paper in a way that makes contact with interval linear
programming.* An opening for further two-way flows is thereby also provided. The resulting junctures
should also help to guide subsequent developments in the more special situations that now seem
to invite consideration in the future. Finally, the possibilities for dealing with specially structured
problems (such as those observed at the start of the present paper) should also be observed explicitly in
this conclusion, partly because the theory we have now developed and presented should also be a helpful
guide to these additional cases which are important in their own right. Thus we can now conclude
here by referring to our opening remarks.
ACKNOWLEDGMENT
We wish to thank W. Szwarc of the University of Wisconsin for comments which helped us to
improve the exposition in the manuscript for this article.
BIBLIOGRAPHY
[1] Balas, E. and M. Padberg, "Equivalent Knapsack-Type Formulations of Bounded Integer Programs," Carnegie-Mellon University (Sept. 1970).
[2] Ben-Israel, A. and A. Charnes, "An Explicit Solution of a Special Class of Linear Programming Problems," Operations Research 16, 1166-1175 (1968).
[3] Ben-Israel, A., A. Charnes, and P. D. Robers, "On Generalized Inverses and Interval Linear Programming," Proceedings of the Symposium on Theory and Applications of Generalized Inverses, held at Texas Technological College, Lubbock, Tex. (Mar. 1968).
[4] Ben-Israel, A. and P. D. Robers, "A Decomposition Method for Interval Linear Programming," Management Science 16, No. 5 (Jan. 1970).
[5] Bradley, G., "Transformation of Integer Programs to Knapsack Problems," Yale University, Rept. No. 37 (1970). To appear in Discrete Mathematics.
*See [2], [3], [4].
[6] Chadda, S. S., "A Decomposition Principle for Fractional Programming," Opsearch 4, 123-132 (1967).
[7] Charnes, A. and W. W. Cooper, Management Models and Industrial Applications of Linear Programming (John Wiley & Sons, Inc., New York, 1961).
[8] Charnes, A. and W. W. Cooper, "Programming with Linear Fractional Functionals," Nav. Res. Log. Quart. 9, 181-186 (Sept.-Dec. 1962).
[9] Dorn, W. S., "Linear Fractional Programming," IBM Research Report RC-830 (Nov. 27, 1962).
[10] Derman, C., "On Sequential Decisions and Markov Chains," Management Science 9, 16-24 (1962).
[11] Gilmore, P. C. and R. E. Gomory, "A Linear Programming Approach to the Cutting Stock Problem," Operations Research 11, 863-888 (1963).
[12] Glover, F. and R. E. Woolsey, "Aggregating Diophantine Equations," University of Colorado Report 704 (Oct. 1970).
[13] Isbell, J. R. and W. H. Marlow, "Attrition Games," Nav. Res. Log. Quart. 3, 71-93 (1956).
[14] Jagannathan, R., "On Some Properties of Programming in Parametric Form Pertaining to Fractional Programming," Management Science 12, 609-615 (1966).
[15] Joksch, H. C., "Programming with Fractional Linear Objective Function," Nav. Res. Log. Quart. 11, 197-204 (1964).
[16] Klein, M., "Inspection-Maintenance-Replacement Schedules under Markovian Deterioration," Management Science 9, 25-32 (1962).
[17] Martos, B., "Hyperbolic Programming," translated by A. and V. Whinston, Nav. Res. Log. Quart. 11, 135-155 (1964).
[18] Robers, P. D., "Interval Linear Programming," Ph.D. Thesis submitted to Northwestern University (Evanston, Ill.), Dept. of Industrial Engineering and Management Sciences (1968).
[19] Robers, P. D. and A. Ben-Israel, "A Suboptimization Method for Interval Linear Programming," Systems Research Memo No. 206, Northwestern University (Evanston, Ill.), The Technological Institute (June 1968).
[20] Swarup, K., "Linear Fractional Functionals Programming," Operations Research 13, 1029-1036 (1965).
[21] Wagner, H. M. and J. S. C. Yuan, "Algorithmic Equivalence in Linear Fractional Programming," Management Science 14, 301-306 (Jan. 1968).
[22] Zionts, S., "Programming with Linear Fractional Functionals," Nav. Res. Log. Quart. 15, 449-452 (Sept. 1968).
USING DECOMPOSITION IN INTEGER PROGRAMMING
Linus Schrage
Graduate School of Business
University of Chicago
ABSTRACT
When implicit enumeration algorithms are used for solving integer programs, a form
of primal decomposition can be used to reduce the number of solutions which must be implicitly
examined. If the problem has the proper structure, then under the proper decomposition
a different enumeration tree can be defined for which the number of solutions which
must be implicitly examined increases with a power of the number of variables rather than
exponentially. The proper structure for this kind of decomposition is that the southwest
and northeast corners of the constraint matrix be zero or, equivalently, that the matrix be
decomposable except for linking columns. Many real traveling salesman, plant location,
production scheduling, and covering problems have this structure.
INTRODUCTION
Consider an integer program for which the constraint matrix has the form shown in Figure 1. The
important feature is that the columns of A_2 link two otherwise independent subproblems.
In general, a set of columns is a linking set if deleting that set partitions all remaining columns
into two disjoint, nonempty sets, A_1 and A_3, such that there do not exist two columns, one in A_1 and one
in A_3, both having a nonzero entry in the same row.

[Figure 1: block constraint matrix with blocks A_1 and A_3 joined by the linking columns of A_2.]

For simplicity, assume that all variables are required to be either 0 or 1, and that there are n
variables in each of the blocks A_1, A_2, and A_3. There are then 2^{3n} solutions to be implicitly examined.
FORM OF THE DECOMPOSITION
We assume that an enumerative scheme similar to that described in Geoffrion [4] is to be used.
We can think of A_2 as being the master problem. If we fix the values of the variables in block A_2 to
some set of values, then the variables in blocks A_1 and A_3 constitute independent subproblems and
can be solved separately. Solving one of these problems (A_1 or A_3) requires us to implicitly examine
2^n solutions. Each of these two problems must be solved for each possible setting of the variables in
A_2. Therefore, we must effectively enumerate 2^n(2^n + 2^n) = 2^{2n+1} solutions. If we define a node to be
the setting of some variable to one of its two values, then perhaps a better measure of computational
difficulty is the number of nodes in the enumeration tree. The nondecomposition complete enumeration
tree has approximately 2^{3n+1} nodes in it. By use of decomposition the complete enumeration tree has
2^n(2^{n+1} + 2^{n+1}) + 2^{n+1} = 2^{2n+2} + 2^{n+1} nodes. In this sense, decomposition reduces the difficulty of
the problem by a factor of approximately 2^{n−1}.
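The counts above are easy to tabulate. A small sketch (the function names are ours, not the paper's):

```python
# Enumeration-tree sizes for the three-block stair-step structure,
# n 0-1 variables per block, following the counts in the text.
def nodes_without_decomposition(n):
    return 2 ** (3 * n + 1)                    # complete tree over 3n binary variables

def nodes_with_decomposition(n):
    # 2^n master settings, each spawning two subtrees of ~2^(n+1) nodes,
    # plus the master tree itself: 2^n (2^(n+1) + 2^(n+1)) + 2^(n+1)
    return 2 ** (2 * n + 2) + 2 ** (n + 1)

for n in (5, 10):
    ratio = nodes_without_decomposition(n) / nodes_with_decomposition(n)
    print(n, ratio)   # roughly 2^(n-1)
```

The printed ratios approach 2^(n−1) as n grows, matching the reduction factor claimed in the text.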
EXISTENCE OF THE DECOMPOSITION STRUCTURE
One can argue that this stair-step structure exists in several classes of integer programs. In a traveling
salesman problem based on the United States, the variables in A_2 might correspond to the choice
of arcs connecting cities in the Midwest. For each set of these arcs chosen, one should be able to complete
the eastern and western legs of the tour independently.
Similar arguments can be given for the plant location problem based on cities in the United States.
The variables in A_2 would correspond to the decisions of which plants to build in the Midwest. For each
set of Midwest plant decisions one would expect to be able to solve the eastern and western plant location
problems independently.
The Wagner-Whitin [10] dynamic lot-size problem is a special case of the plant location problem
which has an even more obvious decomposition structure. Once we decide to produce in a particular
period, the production plans for subsequent and for previous periods can be solved independently. By
discarding variables which obviously cannot be in the optimal solution, the 12-period problem given in
Wagner and Whitin [10] can be decomposed into a stair-step structure with seven blocks.
Another class of integer programs are covering problems. One situation giving rise to covering
problems is the assignment of vehicles to routes. Each city in the service area must be "covered"
by at least one route. Arguments similar to those for the traveling salesman and plant location problems
can then be made.
GENERALIZATION TO MORE THAN THREE BLOCKS
Consider a constraint matrix with the stair-step structure shown in Figure 2. Again, for simplicity,
we assume the problem is composed only of 0-1 variables.

[Figure 2: stair-step constraint matrix with overlapping diagonal blocks A_1, A_2, A_3, A_4, . . ., A_{k−1}, A_k.]

This problem can be decomposed into a hierarchy of masters and subproblems. Assume there is
a total of k blocks. Choose as the highest level master block number [k/2] + 1, where [x] is the greatest
integer no larger than x. This divides the problem into two independent subproblems with [k/2] and
[(k − 1)/2] blocks each. These two subproblems can themselves be decomposed in similar fashion.
If this decomposition is carried on in this recursive fashion we will then have approximately log_2 k
levels of decomposition. In doing the enumeration we will first set the variables in block [k/2] + 1.
For each setting of these variables we must solve the independent subproblems composed of blocks
1 to [k/2] and blocks [k/2] + 2 to k. The first of these subproblems is solved by first setting the variables
in block [[k/2]/2] + 1. This divides the problem further into independent subproblems. This enumeration
method is applied recursively to each independent subproblem created. The result is a binary tree
of independent subproblems.
For simplicity assume that each block contains n 0-1 variables. Let s(j) be the number of terminal
nodes in a decomposition enumeration tree with j levels. If we increase the number of blocks such that
the number of levels of decomposition increases from j to j + 1, then each of the 2^n solutions to the
highest level master partitions the remainder of the problem into two j-level problems. We then have that
s(j + 1) = 2^n · 2 · s(j), where s(1) = 2^n. Thus, s(j) = (2^{n+1})^j/2 and the number of terminal nodes in the
decomposition tree of a problem with k blocks is (1/2)(2^{n+1})^{log_2 k} = (1/2)k^{n+1}. A perhaps more accurate
measure of the size of the enumeration tree is the total number of nodes, terminal and intermediate,
in the tree. Let t(j) be the total number of nodes in the decomposition enumeration tree with j levels
of decomposition. The total number of nodes in a simple binary tree with 2^n terminal nodes is 2^{n+1} − 1,
or approximately 2^{n+1}. We can now argue as before to claim that approximately t(j + 1) = 2^n · 2 · t(j)
+ 2^{n+1} = 2^{n+1}(t(j) + 1). Now, t(1) is approximately 2^{n+1}. Thus t(j) is approximately

	Σ_{i=1}^{j} (2^{n+1})^i = [2^{n+1} − (2^{n+1})^{j+1}]/[1 − 2^{n+1}],

which is approximately (2^{n+1})^j for n and j large. Thus, the total number of nodes in the decomposition
tree of a problem with k blocks is approximately (2^{n+1})^{log_2 k} = k^{n+1}. We now see that the number of
solutions and the total number of nodes which must be implicitly examined increase with a power
of the number of variables if the block size remains constant as the problem size is increased. The
number of solutions which would have to be implicitly examined under no decomposition is 2^{kn}, and
the number of nodes in the tree under no decomposition is 2^{kn+1} − 1. For example, if n = 10 and k = 3,
then decomposition decreases the size of the tree by a factor of about 500. If k = 7 the factor is about 10^{11}.
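The recursion for the total node count can be checked directly against its closed form; the helper below is ours, not the paper's:

```python
# t(j): approximate total nodes with j levels of decomposition, per the
# recursion t(j+1) = 2^(n+1) (t(j) + 1) with t(1) = 2^(n+1).
# Closed form: t(j) ~ (2^(n+1))^j, which with j = log2 k levels for a
# k-block problem gives ~ k^(n+1) nodes.
def t(n, j):
    if j == 1:
        return 2 ** (n + 1)
    return 2 ** (n + 1) * (t(n, j - 1) + 1)

n, j = 10, 3
closed_form = (2 ** (n + 1)) ** j
print(t(n, j) / closed_form)   # close to 1 for n, j of this size
```

For n = 10 and j = 3 the exact recursion and the leading-order closed form agree to within a fraction of a percent, which is why the k^{n+1} growth rate is the quantity worth quoting.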
Edmonds [3] made the interesting suggestion that a problem be considered tractable if and only
if one can exhibit an algorithm for its solution whose running time is bounded by a polynomial in the
size of the problem. There is no known algorithm for general integer programs which is polynomial bounded.
In fact, Karp [5] gives "theorems which strongly suggest, but do not imply, that these problems, as well
as many others, will remain intractable perpetually." It is interesting therefore that the class of decomposable
integer programs just described can be solved by a polynomial bounded algorithm, namely, by
simply searching the decomposition enumeration tree.
The Wagner-Whitin [10] example problem, for example, is a 0-1 problem with 12 variables. The
first variable is required to be 1, so there are 2^{11} = 2,048 feasible solutions to the problem and 4,095
nodes in the simple enumeration tree. Using the seven-block decomposition mentioned earlier, the
equivalent of only 96 solutions need be examined implicitly. The number of nodes in this decomposition
tree is 191. The computation involved in an enumerative algorithm should be approximately
proportional to the number of nodes examined. Complete enumeration of the 191-node decomposition
tree in this case is not an unreasonable solution method.
MORE THAN TWO BLOCKS PER DECOMPOSITION
In the analysis thus far we have assumed that a linking block decomposed a problem into two
independent subproblems. Much the same analysis could be done if a set of linking columns decomposed
a problem into more than two independent subproblems. See, for example, Figure 3.

[Figure 3: constraint matrix in which the linking columns of A_1 join the otherwise independent blocks A_2, A_3, A_4, and A_5.]

For each of the settings of the variables in A_1 which must be examined, the subproblems A_2,
A_3, A_4, and A_5 can be solved independently.
PROBLEMS WITH LINKING CONSTRAINTS
The more common form of decomposition structure discussed in the linear programming literature
involves a set of submatrices with no rows in common, except that there is a set of constraints at
the top linking all the submatrices together. It may be that problems which are formulated with that
structure actually have a structure like Figure 2. The rows in common between A_1 and A_2, A_3 and A_4,
etc., could be moved to the top, and then one would have the more common form of decomposition
structure.
The decomposition approach described may also be useful for problems where the natural formulation
is with linking constraints, by realizing that a linking constraint can be replaced by a linking column
and two nonlinking constraints. Consider the linking constraint:

	Σ_{j=1}^{2n} a_j x_j = b.

Suppose that the problem is decomposable into two subproblems, each with n variables, except for
this constraint. This linking constraint can be replaced by one linking variable, y, and two nonlinking
constraints as follows:

	Σ_{j=1}^{n} a_j x_j − y = 0

	y + Σ_{j=n+1}^{2n} a_j x_j = b.
Assume again that the x_j must be either 0 or 1 and all the a_j are integer. In the worst possible case
we must examine 2^n different values for y, and the size of the tree under decomposition actually doubles.
We would expect the number of different values of y to be examined to be much less than 2^n. In a
covering problem, for example, the a_j's are either 0 or 1. Then, in the worst possible case we must examine
n different values of y. The number of terminal nodes in the decomposition tree is then n(2^n + 2^n)
= n2^{n+1} versus 2^{2n} terminal nodes in the nondecomposition tree. If n = 20, for example, then the number
of terminal nodes is reduced by a factor of approximately 26,000.
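The replacement of a linking constraint by a linking variable can be written out explicitly. The sketch below uses hypothetical coefficient data, and the row encoding (dense coefficient lists, with y appended as the last variable) is ours:

```python
def split_linking_constraint(a, b, n):
    """Replace the linking row  sum_{j=1}^{2n} a_j x_j = b  by a linking
    variable y (appended as variable 2n+1) and two non-linking rows:
        sum_{j<=n} a_j x_j - y = 0     (touches only the first block)
        y + sum_{j>n} a_j x_j = b      (touches only the second block)."""
    row1 = [a[j] for j in range(n)] + [0] * n + [-1], 0
    row2 = [0] * n + [a[j] for j in range(n, 2 * n)] + [1], b
    return row1, row2

# hypothetical data: n = 2 and the constraint x1 + x3 + x4 = 2
(r1, b1), (r2, b2) = split_linking_constraint([1, 0, 1, 1], 2, 2)
```

Adding the two new rows recovers the original constraint after eliminating y, so the feasible x's are unchanged; only the tree over y's possible values grows, as discussed above.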
If one considers the usual decomposition structure studied in linear programming, where there are
p independent subproblems row-linked by a single master block with q rows, then the interested reader
should be able to convince himself that the proper generalization of the transformation described
in the previous problem is to reformulate the problem as a multilevel decomposition problem with
p − 1 sets of linking columns and approximately log_2 p levels of decomposition. Each linking set would
consist of q columns.
MIXED VARIABLES CASE
We have considered only pure 0-1 integer programs thus far. The analysis generalizes fairly naturally
to the case where some of the variables may take on any integer value. The analysis also extends
to the case where some of the variables are not required to be integral, if fixing the integer variables
in a block completely implies values for the continuous variables in that block.
DISADVANTAGES OF THIS DECOMPOSITION METHOD AND THEIR PARTIAL
ALLEVIATION
A disadvantage of this decomposition method is that the order in which variables are placed in
the enumeration tree is partially specified beforehand. The importance of flexibility in the tree search,
especially in specifying the order of addition of variables to the enumeration tree, has been pointed
out [2], [4], [6], [7], [9]. For example, if there is a variable which can take on the value of either 0 or 1
in the optimal solution (i.e., there are alternate optima), then the amount of searching required by a
branch-and-bound or implicit enumeration algorithm is approximately doubled if this variable is placed
first in the tree rather than last. If there are unimportant variables in the highest level master problem,
the performance of an implicit enumeration algorithm could be appreciably degraded by the decomposition
approach.
Another apparent disadvantage is the additional bookkeeping which must be done to keep track
of the decomposition tree.
The first disadvantage can be alleviated somewhat by adapting the flexible tree-search procedure
described in Tuan [9] and Bravo et al. [2]. This approach allows one to partially reorder a tree without
re-searching branches already searched or discarding branches yet to be searched. With respect to
the second disadvantage, the bookkeeping scheme is not significantly more complex than conventional
ones.
The additional restrictions on the branching scheme are that
(a) we cannot branch on a variable i unless all variables in the master of the subproblem containing
i have been fixed;
(b) we can backtrack any time the bound on the subproblem currently being solved is worse than
the value of some feasible solution to the subproblem for the current setting of the variables in the
master problem; or
(c) we can backtrack any time that the overall bound is worse than the value of some known
feasible solution to the entire problem.
474 L. SCHRAGE
COMPUTATIONAL EVALUATION
A computer program was written incorporating the decomposition method into a backtrack implicit enumeration scheme. The program was similar to those described by Geoffrion [4]. The revised simplex method of linear programming with explicit inverse was used to calculate bounds. After any variable was forced to 0 or 1, dual pivots were performed to return to feasibility. After any variable in a backtrack step was released from 0 or 1, primal pivots were performed to return to optimality. The rule for selecting the next variable to force to 0 or 1 was simply to force to the nearer integer value that basic variable which was closest in value to an integer.
The implementation was inefficient in at least two ways: (1) The LP portion worked with a full inverse at all times. That is, even though only a subproblem was being optimized, pivots were done in the full inverse. For the problems considered, each subproblem had half as many rows as the full problem; therefore each pivot in the implemented procedure may have taken as much as four times as much work as really necessary. (2) Natural integrality in the master problem was not taken advantage of. Before a subproblem can be searched, each variable in its master must be fixed to an integer value. If some variable Xj in the master was fixed to 1 and would have remained at the value 1, even if not constrained to the value 1, all through the enumeration of all subproblems, then it would not be necessary to examine Xj = 0 in the backtrack step. Most integer programming routines take advantage of this natural integrality. The program here did not.
A class of decomposable integer programs was derived based on a problem known as IBM-1. A description of this problem can be found in Trauth and Woolsey [8]. The problem is a general integer program with seven variables and seven constraints. Two 0-1 variables were required to represent each of the original seven variables. A series of eight problems with the stair-step structure shown in Figure 2 was created. Each problem was composed of three blocks. Each of the two outer blocks consisted of a copy of the IBM-1 constraint matrix. In the eight different problems, the middle linking block consisted of the first k columns of IBM-1, where k ranged from zero to seven. All problems had 14 rows.
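The construction of such stair-step test problems can be sketched as follows. Since Figure 2 is not reproduced here, the exact placement of the linking columns is an illustrative assumption (linking columns intersecting both row blocks), with an arbitrary block standing in for the IBM-1 matrix:

```python
import numpy as np

def stairstep_problem(B, k):
    """Assemble a stair-step constraint matrix: two diagonal copies of a
    block B, linked by the first k columns of B placed in the middle.

    The layout of the linking block (here: repeated in both row blocks)
    is an illustrative guess at the structure described in the text.
    """
    m, n = B.shape
    A = np.zeros((2 * m, 2 * n + k))
    A[:m, :n] = B                 # upper-left block
    A[m:, n + k:] = B             # lower-right block
    if k:                         # middle linking block
        A[:m, n:n + k] = B[:, :k]
        A[m:, n:n + k] = B[:, :k]
    return A
```

With B a 7-row copy of the IBM-1 constraint matrix, this yields the 14-row problems described above; with k = 0 the problem separates completely.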
[Figure 4. Number of pivots versus number of linking columns, with and without decomposition.]
These problems were solved on an IBM 360/65. This machine uses multiprogramming; thus run
times are random variables. The number of pivots is therefore perhaps a more reliable estimate of
computational difficulty because most of the work is involved in pivoting. This statistic is plotted versus
number of linking columns in Figure 4. As expected, the advantage of decomposition tends to diminish
as the number of linking columns increases. The same program was used to solve the problems without
decomposition; the program was simply not told that the problem was decomposable. Recall that under
decomposition, each pivot should require less work than under no decomposition.
Other computational statistics are displayed in Table 1. The link-edit time column is included only to give an indication of the variability in run times. The link-edit step required exactly the same amount of work in every run.
Table 1. A Comparison of Decomposition with No Decomposition

  No. of        Pivots           Time (sec)       Nodes examined    Link-edit time (sec)
  linking    With    With-     With     With-     With    With-       With     With-
  columns   decomp.   out     decomp.    out     decomp.   out       decomp.    out

     0        174     246       4.50     6.22      48       79         2.62     2.80
     1        241     427       6.07     7.45      52      106         2.55     2.75
     2        257     404       8.12     8.07      57      108         3.15     2.70
     3        255     445       7.09     8.40      60      109         2.87     2.74
     4        334     406       8.84     7.89      68      102         3.24     2.60
     5        346     383       8.92     7.39      70       78         2.55     2.67
     6        384     446       9.97     9.07      87       73         2.47     2.85
     7        384     262      10.40     9.29      99       86         2.84     2.92
The run times under decomposition are perhaps longer than they need be in practice because
a full solution report was printed each time a better integer solution was obtained to any subproblem.
The runs without decomposition would typically produce three solution reports while a run under
decomposition would typically produce, say, nine solution reports. A full solution report requires a
fair amount of work because the inverse must be multiplied through the full matrix to calculate such
things as reduced costs and dual prices.
The decomposition method for these problems seems to be fairly robust in that the amount of work seems to increase less than linearly with the number of linking columns for this class of problems.
For a small number of linking columns, decomposition is clearly superior.
REFERENCES
[1] Balas, E., "An Additive Algorithm for Solving Linear Programs with Zero-One Variables," Operations Research 13, 517-546 (1965).
[2] Bravo, A., J. G. Gomez, L. Lustosa, L. Schrage, and N. Pizzolato, "A Mixed Integer Programming Code," CMSBE Report No. 7043, University of Chicago (Sep. 1970).
[3] Edmonds, J., "Paths, Trees, and Flowers," Canadian J. Math. 17, 449-467 (1965).
[4] Geoffrion, A. M., "An Improved Implicit Enumeration Approach for Integer Programming," Operations Research 17, 437-454 (May-June 1969).
[5] Karp, R. M., "Reducibility Among Combinatorial Problems," presented at ORSA National Convention, New Orleans, La. (Apr. 26, 1972).
[6] Salkin, H. M., "On the Merit of Generalized Origin and Restarts in Implicit Enumeration," Operations Research 18, 549-555 (May-June 1970).
[7] Spielberg, K., "Plant Location with Generalized Search Origin," Management Science 16, 165-178 (1969).
[8] Trauth, C. A. and R. E. Woolsey, "Integer Linear Programming: A Study in Computational Efficiency," Management Science 15, 481-493 (May 1969).
[9] Tuan, Nghiem Ph., "A Flexible Tree-Search Method for Integer Programming Problems," Operations Research 19, 115-119 (Jan.-Feb. 1971).
[10] Wagner, H. and T. M. Whitin, "Dynamic Version of the Economic Lot Size Model," Management Science 5, No. 1 (Oct. 1958).
NUMERICAL TREATMENT OF A CLASS
OF SEMI-INFINITE PROGRAMMING PROBLEMS*

S. A. Gustafson
The Royal Institute of Technology
Stockholm, Sweden

and

K. O. Kortanek
Carnegie-Mellon University
Pittsburgh, Pennsylvania
ABSTRACT
Many optimization problems occur in both theory and practice when one has to optimize
an objective function while an infinite number of constraints must be satisfied. The aim
of this paper is to describe methods of handling such problems numerically in an effective
manner. We also indicate a number of applications.
1. INTRODUCTION
In order to illustrate the subject of this paper, we immediately give a few examples of the class of problems we wish to study.
EXAMPLE 1.1: One wants to determine a cumulative distribution function G which corresponds to a stochastic variable, which can assume values inside a finite closed interval [a, b]. In N points t_1, t_2, \ldots, t_N, one has measured the values of G and obtained g_1, g_2, \ldots, g_N. We want to approximate G in [a, b] by a polynomial P of a degree less than a certain number n. It is natural to write

P(t) = \sum_{r=1}^{n} y_r t^{r-1}

and require that P(a) = 0, P(b) = 1, and P'(t) \ge 0. Since we cannot hope that P passes through the measured points, we try to solve the problem

\inf_{y_1, y_2, \ldots, y_n} \sum_{j=1}^{N} \Bigl( \sum_{r=1}^{n} y_r t_j^{r-1} - g_j \Bigr)^2,

subject to

\sum_{r=1}^{n} y_r a^{r-1} = 0,

\sum_{r=2}^{n} (r-1) y_r t^{r-2} \ge 0, \quad a \le t \le b,

\sum_{r=1}^{n} y_r b^{r-1} = 1.
*This research was supported in part by National Science Foundation Grant GK 31833.
478 S. A. GUSTAFSON AND K. O. KORTANEK
It is easily shown by examples that this problem may or may not be feasible, depending on the given
data.
This problem appeared when one wanted to study size distributions of grains in gravel deposits in order to get a suitable raw material for concrete production (Gustafson-Martna [27]). In the referenced paper, one did not attempt to solve exactly the problem indicated above, but used piecewise polynomial interpolation through the measured points instead, in order to meet the monotonicity condition.
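One practical way to attack the problem of Example 1.1 today is to discretize the infinite monotonicity constraint on a grid and hand the resulting finite problem to a general nonlinear programming routine. The sketch below does this with SciPy's SLSQP solver; the measurements t and g, the degree n, and the grid size are all hypothetical:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data: measurements of a c.d.f. on [a, b] = [0, 1]
a, b, n = 0.0, 1.0, 4                      # polynomial degree < n
t = np.array([0.2, 0.4, 0.6, 0.8, 0.9])    # measurement points t_1..t_N
g = np.array([0.1, 0.3, 0.55, 0.8, 0.92])  # measured values g_1..g_N

def P(y, x):   # P(x) = sum_r y_r x^(r-1)
    return sum(y[r] * x ** r for r in range(n))

def dP(y, x):  # P'(x) = sum_r (r-1) y_r x^(r-2)
    return sum(r * y[r] * x ** (r - 1) for r in range(1, n))

grid = np.linspace(a, b, 51)               # discretization of a <= t <= b
cons = ([{"type": "eq", "fun": lambda y: P(y, a)},          # P(a) = 0
         {"type": "eq", "fun": lambda y: P(y, b) - 1.0}]    # P(b) = 1
        + [{"type": "ineq", "fun": lambda y, x=x: dP(y, x)}  # P'(t) >= 0
           for x in grid])

obj = lambda y: np.sum((np.array([P(y, ti) for ti in t]) - g) ** 2)
res = minimize(obj, np.zeros(n), method="SLSQP", constraints=cons)
```

Discretizing the derivative constraint only enforces monotonicity at the grid points; the error bounds of section 3 quantify how far such a grid solution can be from satisfying the constraint on all of [a, b].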
EXAMPLE 1.2: Bojanic-DeVore [4] discuss the problem of one-sided approximation of a given function from below, while maximizing a linear functional. Their problem can be stated as follows: Let [a, b] be a closed interval and u_1, u_2, \ldots, u_n n given functions which form a Cebysev system (for a definition see, e.g., Karlin-Studden [32] or Gustafson [20]). Let further \phi be continuous on [a, b]. Determine

\max \sum_{r=1}^{n} y_r \int_a^b u_r(t) \, dt,

subject to

\sum_{r=1}^{n} y_r u_r(t) \le \phi(t), \quad t \in [a, b].

Bojanic-DeVore give some unicity and existence results and identify the solutions with certain quadrature rules. Methods for finding computational solutions to this problem are given in Gustafson [20].
EXAMPLE 1.3: Let again [a, b] be a closed interval, \omega a positive function defined on [a, b], and \phi continuous on the same interval. We want to determine a polynomial of degree less than n which approximates \phi as well as possible in the weighted maximum norm determined by \omega. That is, we want to solve the problem:

Compute \min \eta subject to

\omega(t) \Bigl| \sum_{r=1}^{n} y_r t^{r-1} - \phi(t) \Bigr| \le \eta, \quad t \in [a, b].

We can write this task in the equivalent form:

Compute \min \eta subject to

\sum_{r=1}^{n} y_r t^{r-1} \omega(t) - \eta \le \phi(t)\omega(t) \quad \text{and} \quad -\sum_{r=1}^{n} y_r t^{r-1} \omega(t) - \eta \le -\phi(t)\omega(t).

This problem is well known and, for \omega(t) = 1, a solution is constructed computationally by means of Remez' algorithms (see, e.g., Cheney [9]). For numerical purposes, an approximate solution often is satisfactory (Powell [41], Gustafson-Dahlquist [23]).
EXAMPLE 1.4: Kantorovich-Rubinshtein [31] and Rubinshtein [43] give examples of production scheduling problems where an infinite number of linear constraints must be met. Also, Vershik-Temel't [48] propose a process of finding a sequence of approximating finite linear programming problems whose optimal values converge to the optimal value of the infinite linear programming problem.
EXAMPLE 1.5: Gorr-Kortanek [18], Gustafson-Kortanek [24], and Gustafson-Kortanek [25] give examples of models for the study of air pollution problems, where an infinite number of linear constraints must be fulfilled over a two-dimensional set S.
Additional Examples and Problems
As illustrated above, moment problems stem from problems in approximation and minimization; see Shohat-Tamarkin [45], Rivlin-Shapiro [42], Shapiro [44], Karlin-Studden [32], and others. This leads to applications of infinite programming techniques to analysis, Duffin [13], [14], Kretschmer [36], [37], Duffin-Karlovitz [15], including the development of orthogonality theorems and similar results with application to the theory of integral equations. While applications of the moment problem to statistics and probability theory are well known, recent problems in these areas have been brought into contact with the theory of moments by Krafft [35]. Interesting applications of generalized moment problems have also been made in theoretical physics; see Baker-Gammel [1]. See also the classification theory of Ben Israel-Charnes-Kortanek [3] for linear programming problems over closed convex sets in locally convex spaces and applications to approximation theory.
DEFINITION 1.1: We denote by problem D the general task:

Compute

\inf_{y_1, y_2, \ldots, y_n} G(y_1, y_2, \ldots, y_n),

subject to

(1.1) \sum_{r=1}^{n} y_r u_r(x) \ge \phi(x), \quad x \in S.

Here S is a given set, and u_1, u_2, \ldots, u_n, \phi are given functions defined over S. The objective function G is also given and must be defined for all vectors y_1, y_2, \ldots, y_n which satisfy (1.1).
We will refer to problem D as a semi-infinite program. The fact that n is finite is crucial in our analysis. We observe that all the examples, 1.1 through 1.5, are instances of problem D.
Many well-known optimization problems are subsumed by problem D. If S has a finite number of elements, we arrive at mathematical programming tasks of various kinds. Note in particular that if
G also is linear, we get linear programs. For a discussion of these problems, the reader is referred to
the textbooks by CharnesCooper [6] and Dantzig [10].
We want, instead, to discuss the case when S has an infinite number of elements. The general
theory of semi-infinite programming embracing such problems is given in several papers by Charnes,
Cooper, and Kortanek [7], [8]. Included in their theory is the development of regularization techniques,
analogous to those of finite linear programming, which we use in our computational developments.
In section 2 of this paper we present the parts of the general theory which are relevant for our purpose. We discuss the intimate connection between problem D and certain so-called moment problems. We establish that the solutions of problem D can be found if one can solve a system of a finite number of scalar equations in a finite number of unknowns. This system is nonlinear even if G is linear, and its numerical solution is a nontrivial task. In section 3 we propose an algorithm to be used in
practical computational work. The basic idea underlying our algorithm is that D is approximated by a problem with a finite number of constraints. The solution hereby obtained is then used as an initial approximation which is improved by Newton-Raphson iterations (other iterative methods might be considered). Thus the solution of semi-infinite programs can be achieved by combining well-known standard techniques. In particular problems, special short cuts can be used in order to facilitate the computations (see, e.g., Gustafson [20], [21]). We also discuss questions in connection with error estimation. We treat the problem of assessing how perturbations of input data influence the optimal solution and corresponding value of the objective function.
2. GENERAL RESULTS
DEFINITION 2.1: Let K be the set of vectors y = (y_1, y_2, \ldots, y_n) which satisfy (1.1). We refer to K as the constraint set of D.
We note that K is always contained in R^n, the n-dimensional vector space (independent of the nature of S). If K is nonempty, it is convex. We give three simple instances of D in order to illustrate different situations that can occur.
EXAMPLE 2.1: Find

\inf y_1 + y_2,

when

y_1 x + y_2 x^2 \ge 1, \quad x \in [0, 1]

(inconsistent).
EXAMPLE 2.2: Find

\inf y_1,

when

y_1 + y_2 x \ge \sqrt{x}, \quad x \in [0, 1]

(the inf-value is 0, but it is not attained).
EXAMPLE 2.3: Find

\inf y_1 + \tfrac{1}{2} y_2,

when

y_1 + y_2 x \ge \frac{1}{1+x}, \quad x \in [0, 1]

(the inf-value is 3/4, and it is assumed for y_1 = 1, y_2 = -1/2).
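Example 2.3 is easily checked numerically by replacing S = [0, 1] with a fine grid and solving the resulting ordinary linear program; the sketch below uses SciPy, and the grid size is an arbitrary choice:

```python
import numpy as np
from scipy.optimize import linprog

# min y1 + (1/2) y2  s.t.  y1 + y2*x >= 1/(1+x) on a grid over [0, 1]
xs = np.linspace(0.0, 1.0, 101)
A_ub = -np.column_stack([np.ones_like(xs), xs])   # -(y1 + y2 x) <= -phi(x)
b_ub = -1.0 / (1.0 + xs)
res = linprog(c=[1.0, 0.5], A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * 2)          # y unrestricted in sign
# res.fun = 3/4 with res.x = (1, -1/2)
```

Because the two active points x = 0 and x = 1 of the continuous problem belong to the grid, the discretized program here reproduces the semi-infinite optimum exactly; in general a grid solution only approximates it (see section 3.3).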
Another example is treated in Lemma 3.5 of this paper. It is quite obvious that problems such as Examples 2.1 and 2.2 above will cause difficulties in actual machine computation. We therefore want to specialize problem D, but in such a way that we still retain wide generality, and consider only what we call regularized problems. (Compare Gustafson [20].)
DEFINITION 2.2: We denote by problem D_F a special case of problem D having the properties 1, 2, 3 below:
1. S can be written as S_K \cup S_F, where S_K is a compact subset of the k-dimensional vector space (k < \infty) and S_F has a finite number of elements. The conditions

(2.1) \sum_{r=1}^{n} y_r u_r(x) \ge \phi(x), \quad x \in S_F,

must be such that they alone restrict y to a bounded region K_F of R^n. We require, however, that (2.1) is consistent.
2. We require further that u_1, u_2, \ldots, u_n and \phi are continuous over S_K and that u_1, u_2, \ldots, u_n meet Krein's condition: there exist constants c_1, c_2, \ldots, c_n such that

\sum_{r=1}^{n} c_r u_r(x) > 0, \quad x \in S_K.

3. G must be differentiable and convex on K.
We note that for a task of type D_F, K is a closed and bounded, hence compact, set. Therefore the minimum value Z is always assumed.
Instead of Krein's condition, Gustafson [20] requires that u_1, u_2, \ldots, u_n form a Cebysev system (i.e., a unisolvent set). Unfortunately, this cannot be done if k > 1. We quote the following classical result from Buck [5] (the notations are slightly changed; C(\Omega) denotes the space of functions continuous over \Omega).
LEMMA 2.1: If \Omega is a compact connected set and C(\Omega) contains a unisolvent linear subspace of finite dimension at least 2, then \Omega is homeomorphic to the unit interval or the unit circumference.
Therefore, many results in this section will be generalizations of those in Gustafson-Kortanek-Rom [26].
In order not to get unnecessarily complicated formulae, we treat only the case in which the inequalities corresponding to x \in S_F are of the type (2.2) below. We want first to treat the case

G(y_1, y_2, \ldots, y_n) = \sum_{r=1}^{n} \mu_r y_r,

and then generalize the results to a general D_F-problem. Denote the problem by D_{F0}:

Compute

Z = \min \sum_{r=1}^{n} \mu_r y_r,

subject to

\sum_{r=1}^{n} y_r u_r(x) \ge \phi(x), \quad x \in S_K, \quad \text{and}

(2.2) F_1 \le y_r \le F_2, \quad r = 1, 2, \ldots, n.
We want to show that the optimal solution of D_{F0} can be found by solving a nonlinear system of equations. In order to derive this, we apply the theory of semi-infinite programming. For this purpose we need a few notations.
Let \Sigma be the set of all finite regular measures \alpha which meet the integrability conditions

\int |u_r(x)| \, d\alpha(x) < \infty, \quad r = 1, 2, \ldots, n, \qquad \int |\phi(x)| \, d\alpha(x) < \infty.

Denote by M_n the convex cone in R^n:

M_n = \Bigl\{ \sigma = (\sigma_1, \sigma_2, \ldots, \sigma_n) : \sigma_r = \int u_r(x) \, d\alpha(x), \ r = 1, 2, \ldots, n, \ \alpha \in \Sigma \Bigr\}.
Introduce now the problems P_\mu and D_\mu:

P_\mu: Compute \sup_{\alpha} \int_{S_K} \phi(x) \, d\alpha(x)

subject to \int_{S_K} u_r(x) \, d\alpha(x) = \mu_r, \quad r = 1, 2, \ldots, n, \quad \alpha \in \Sigma.

D_\mu: Compute \inf_{y_1, \ldots, y_n} \sum_{r=1}^{n} \mu_r y_r

subject to \sum_{r=1}^{n} y_r u_r(x) \ge \phi(x), \quad x \in S_K.

From Karlin-Studden [32, p. 472], we quote the result (the notations are slightly changed):
LEMMA 2.2: Let \mu = (\mu_1, \mu_2, \ldots, \mu_n) belong to the interior of M_n. Then the optimal values of P_\mu and D_\mu are equal.
Arguing as in Gustafson [20], we can associate with D_{F0} the semi-infinite dual problem P_{F0}:

Compute

\max \int_{S_K} \phi(x) \, d\alpha(x) + \sum_{r=1}^{n} (F_1 v_r^+ - F_2 v_r^-),

subject to

\int_{S_K} u_r(x) \, d\alpha(x) + v_r^+ - v_r^- = \mu_r, \quad r = 1, 2, \ldots, n,

\alpha \in \Sigma, \quad v_r^+ \ge 0, \quad v_r^- \ge 0, \quad r = 1, 2, \ldots, n.

P_{F0} is always feasible, but may be unbounded. Exactly as in Gustafson [20] and Charnes-Cooper-Kortanek [8], we establish:
LEMMA 2.3: Let D_{F0} have interior points. Then P_{F0} and D_{F0} are consistent and bounded. They assume their optimal values, which are equal.
We also obtain, following [8], [20]:
LEMMA 2.4: Among the optimal solutions of P_{F0}, there are some which correspond to point masses with a finite number of mass points. Furthermore:
LEMMA 2.5: Let D_{F0} have interior points and let an optimal solution of P_{F0} be given by
i) a point-mass distribution with mass m_i at x^i, i = 1, 2, \ldots, q;
ii) q^+ of the v_r^+ are positive, namely v_{r_1}^+, v_{r_2}^+, \ldots, v_{r_{q^+}}^+;
iii) q^- of the v_r^- are positive, namely v_{s_1}^-, v_{s_2}^-, \ldots, v_{s_{q^-}}^-.
Let y = (y_1, y_2, \ldots, y_n) be an optimal solution of D_{F0}. Then the following equations are satisfied:

(2.3) \sum_{i=1}^{q} m_i u_r(x^i) + v_r^+ - v_r^- = \mu_r, \quad r = 1, 2, \ldots, n.

(2.4) \sum_{r=1}^{n} y_r u_r(x^i) = \phi(x^i), \quad i = 1, 2, \ldots, q.

(2.5) y_{r_j} = F_1, \quad j = 1, 2, \ldots, q^+.

(2.6) y_{s_k} = F_2, \quad k = 1, 2, \ldots, q^-.

REMARK: We have also q + q^+ + q^- \le n and m_i > 0, i = 1, 2, \ldots, q. The q column vectors u_1, u_2, \ldots, u_q in (2.3), given by u_i = (u_1(x^i), u_2(x^i), \ldots, u_n(x^i)), are linearly independent.
The relations (2.3), (2.4), (2.5), and (2.6) are necessary conditions for finding optimal solutions. They can be supplemented by further conditions.
Let u_1, u_2, \ldots, u_n and \phi have continuous partial derivatives of the first order. Define Q by

Q(x) = \sum_{r=1}^{n} y_r u_r(x).

Then relation (2.4) takes the form

Q(x^i) = \phi(x^i), \quad i = 1, 2, \ldots, q.

If x belongs to S_K we have

Q(x) \ge \phi(x).

Let x^i be such that there is a nonzero vector h meeting the conditions:

(2.7) x^i + h \in S_K, \quad x^i - h \in S_K.
Define \psi by

\psi(t) = Q(x^i + th) - \phi(x^i + th), \quad -1 \le t \le 1.

\psi has a continuous derivative with respect to t on [-1, 1], \psi(t) \ge 0, and \psi(0) = 0. Hence \psi'(0) = 0.
This observation can be utilized to derive further constraints on Q in the following manner. Determine for each point x^i a system of linearly independent vectors h which meet the requirements (2.7). Denote these vectors by h_1, h_2, \ldots, h_{l_i} (if there is none, put l_i = 0). We always have l_i \le k, the dimensionality of S_K. We note in passing that l_i < k at boundary points only. Denote the directional derivative along h_j by D_j. Then we must conclude

(2.8) D_j(Q(x^i) - \phi(x^i)) = 0, \quad j = 1, 2, \ldots, l_i \ (\text{if } l_i \ge 1), \quad i = 1, 2, \ldots, q.

Equations (2.3), (2.4), (2.5), (2.6), and (2.8) form a nonlinear system, and the optimal solution of the original problem can be found by solving it. The unknowns are the q masses m_1, m_2, \ldots, m_q, the n scalars y_1, y_2, \ldots, y_n, the vectors x^1, x^2, \ldots, x^q, and the numbers v_{r_1}^+, \ldots, v_{r_{q^+}}^+, v_{s_1}^-, \ldots, v_{s_{q^-}}^-.
We mention now two special cases: In the problems treated by Gustafson [20] and Gustafson-Kortanek-Rom [26], we have k = 1. Hence l_i = 0 at boundary points and l_i = 1 at interior points. If S_K is strictly convex (e.g., a nondegenerate ellipsoid), then l_i = 0 at boundary points and l_i = k in the interior.
We next extend our results to nonlinear functions G. From Kortanek-Evans [34, p. 889], we get (after appropriate changes of notations):
LEMMA 2.6: Let G be a continuously differentiable function defined on an open convex set W in R^n. Consider the two problems:

(I) \min G(y), \ y \in K \qquad\qquad (I^*) \min y^T (\nabla G)_{y=y^*}, \ y \in K,

where K is a closed convex set in W. Then y^* is optimal for I if and only if y^* is optimal for I^*, provided either one of the following conditions holds:
(a) G is pseudoconvex;
(b) G is quasiconvex and (\nabla G)_{y=y^*} \ne 0.
Using this lemma, we realize that if G meets condition (a) or (b) of the lemma, we can replace D_F by a linear problem with the objective function G^* defined by

G^*(y_1, y_2, \ldots, y_n) = \sum_{r=1}^{n} y_r \Bigl( \frac{\partial G}{\partial y_r} \Bigr)_{y=y^*},

where y^* = (y_1^*, y_2^*, \ldots, y_n^*) is an optimal solution of D_F. In our nonlinear system (2.3), (2.4), (2.5), (2.6), (2.8), we should replace \mu_r by (\partial G / \partial y_r)_{y=y^*} in (2.3) in order to allow for nonlinear objective functions. (The remaining equations were derived independently of the objective function G.)
DEFINITION 2.3: We denote by system NL the nonlinear system of equations obtained by combining Equations (2.3) through (2.6) with (2.8). If G is nonlinear, \mu_r in (2.3) should be replaced by (\partial G / \partial y_r)_{y=y^*} as described above.
EXAMPLE 2.4:

Compute \min 4y_1 + \tfrac{2}{3}(y_4 + y_6),

when

y_1 + x_1 y_2 + x_2 y_3 + x_1^2 y_4 + x_1 x_2 y_5 + x_2^2 y_6 \ge 3 - (x_1 - x_2)^2 (x_1 + x_2)^2,

(x_1, x_2) \in S_2 = \{ (x_1, x_2) : |x_i| \le 1, \ i = 1, 2 \}.

The associated moment problem P_{F0} reads:

compute \max_{\alpha} \int_{S_2} [3 - (x_1 - x_2)^2 (x_1 + x_2)^2] \, d\alpha(x_1, x_2),

when

\int_{S_2} d\alpha(x_1, x_2) = 4,
\int_{S_2} x_1 \, d\alpha(x_1, x_2) = 0,
\int_{S_2} x_2 \, d\alpha(x_1, x_2) = 0,
\int_{S_2} x_1^2 \, d\alpha(x_1, x_2) = 2/3,
\int_{S_2} x_1 x_2 \, d\alpha(x_1, x_2) = 0,
\int_{S_2} x_2^2 \, d\alpha(x_1, x_2) = 2/3.

By inspection we find that P_{F0} has the feasible solution with four mass points:

   m        x_1           x_2
   1      1/\sqrt{6}    1/\sqrt{6}
   1      1/\sqrt{6}   -1/\sqrt{6}
   1     -1/\sqrt{6}    1/\sqrt{6}
   1     -1/\sqrt{6}   -1/\sqrt{6}

D_{F0} has the feasible solution y_1 = 3, y_2 = y_3 = \ldots = y_6 = 0. The preference function assumes the value 12 in both problems; that is, we have found an optimal solution.
We observe that any other feasible solution d\alpha(x_1, x_2) of P_{F0} has the same first and second moments, and hence we have found the quadrature rule with positive weights:

\int_{S_2} \phi(x_1, x_2) \, d\alpha(x_1, x_2) = \phi\Bigl(\tfrac{1}{\sqrt 6}, \tfrac{1}{\sqrt 6}\Bigr) + \phi\Bigl(\tfrac{1}{\sqrt 6}, -\tfrac{1}{\sqrt 6}\Bigr) + \phi\Bigl(-\tfrac{1}{\sqrt 6}, \tfrac{1}{\sqrt 6}\Bigr) + \phi\Bigl(-\tfrac{1}{\sqrt 6}, -\tfrac{1}{\sqrt 6}\Bigr).

The rule has positive weights and is exact if \phi is a polynomial of two variables and of degree less than 3.
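The claimed optimality in Example 2.4 is easy to verify numerically: the four unit point masses reproduce the moments \mu = (4, 0, 0, 2/3, 0, 2/3) and give the moment-problem objective the value 12, while y = (3, 0, \ldots, 0) is feasible (since \phi \le 3 on S_2) with the same value. A small check in Python:

```python
import numpy as np

s = 1 / np.sqrt(6.0)
points = [(s, s), (s, -s), (-s, s), (-s, -s)]     # mass 1 at each point
phi = lambda x1, x2: 3 - (x1 - x2) ** 2 * (x1 + x2) ** 2

u = lambda x1, x2: np.array([1, x1, x2, x1 ** 2, x1 * x2, x2 ** 2])
mu = sum(u(*p) for p in points)    # moments of the point-mass measure
val = sum(phi(*p) for p in points)  # objective of the moment problem

# primal check: y = (3, 0, ..., 0) gives 4*3 + (2/3)*(0 + 0) = 12 as well,
# and is feasible because (x1 - x2)^2 (x1 + x2)^2 >= 0 everywhere
```

Equality of a feasible primal value and a feasible dual value certifies optimality of both, exactly as argued in the text.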
3. COMPUTATIONAL SOLUTION
3.1 Definition of Acceptable Approximations
In this section we give the general principles of a computational scheme for the solution of the problem D_F (Definition 2.2).
DEFINITION 3.1: Let y be any vector. We define the nonoptimality \Delta Z(y) as |Z - G(y)|, where Z is the optimal value of D_F.
DEFINITION 3.2: Let again y be any vector. We define the discrepancy of y by

\delta(y) = \min \Bigl\{ \sum_{r=1}^{n} y_r u_r(x) - \phi(x) : x \in S_K \Bigr\}.

Thus y is an optimal solution vector if \delta(y) \ge 0 and G(y) = Z. Generally, one has to be content with trying to find a vector y such that \delta(y) \ge -\delta_0 and |G(y) - Z| \le \epsilon_0, where the positive numbers \delta_0 and \epsilon_0 are given tolerances. (Such a y is called an acceptable approximation.) We want to show that this can be done by means of a finite number of operations, provided these are carried out with sufficiently good accuracy.
3.2 Cutting-plane Methods and Alternating Procedures
D_F amounts to minimizing a convex function over a compact convex set K. This set is not specified in the form of a few simple equations. Instead, we know its supporting planes, which are:

\sum_{r=1}^{n} y_r u_r(x) - \phi(x) \ge 0, \quad x \in S.

One can therefore contemplate using the principles of the cutting-plane method of Kelley [33], with the accelerating device by Wolfe [51]. The first algorithm by Remez (see, e.g., Cheney [9], p. 96) can be considered as a variant of the cutting-plane method.
We generalize this algorithm and define the following alternating procedure. (The word alternating refers to the fact that the optimal solution of D_F is computed by alternately minimizing G over subsets of K_F containing K and minimizing certain functions over S.)
The general step is: let x^1, x^2, \ldots, x^{s-1} (s \ge 2) be given elements in S. Take y^s as an optimal solution vector of the problem

\min G(y),

subject to

\sum_{r=1}^{n} y_r u_r(x^j) - \phi(x^j) \ge 0, \quad j = 1, 2, \ldots, s-1.

Then define x^s as an element in S which minimizes

\sum_{r=1}^{n} y_r^s u_r(x) - \phi(x), \quad x \in S.

If this last minimum is nonnegative, the process is terminated. Otherwise we generate y^{s+1}, y^{s+2}, \ldots.
THEOREM 3.1: The sequence y^s, y^{s+1}, \ldots generated by the alternating procedure above converges toward an optimal solution of D_F.
PROOF: Since y^{s+1} meets all the constraints imposed on y^s, G(y^{s+1}) \ge G(y^s). The same is true for any optimal vector of D_F. Hence G(y^s) \le G(y^{s+1}) \le \ldots \le d, where d is the optimal value. Three cases are conceivable, namely:
CASE A: The alternating process stops after a finite number of iterations.
CASE B: \lim G(y^s) = d - \eta, \quad \eta > 0.
CASE C: \lim G(y^s) = d.
If Case A occurs, the optimal vector has been reached, because the last vector satisfies the constraints of K. We want to show that Case B is not possible. Since \{y^s\} is an infinite sequence confined to the compact set K_F, it has accumulation points. Let y^* be such a point. Put G(y^*) = d - \epsilon, \epsilon > 0. Hence y^* does not belong to K.
Let \bar{x} be an element in S which minimizes \sum_{r=1}^{n} y_r^* u_r(x) - \phi(x). Denote the corresponding minimum by A. We must have A < 0, since y^* \notin K.
From the definition of y^s we conclude

(*) \sum_{r=1}^{n} y_r^* u_r(x^j) - \phi(x^j) \ge 0, \quad j = 1, 2, \ldots.

Let now \{y^{s_j}\} be a subsequence such that y^{s_j} \to y^* and the x^{s_j} tend toward an accumulation point x^*. We find for each j:

\sum_{r=1}^{n} y_r^{s_j} u_r(\bar{x}) - \phi(\bar{x}) \ge \sum_{r=1}^{n} y_r^{s_j} u_r(x^{s_j}) - \phi(x^{s_j}).

Letting j \to \infty, we arrive at

A = \sum_{r=1}^{n} y_r^* u_r(\bar{x}) - \phi(\bar{x}) \ge \sum_{r=1}^{n} y_r^* u_r(x^*) - \phi(x^*).

But by (*)

\sum_{r=1}^{n} y_r^* u_r(x^*) - \phi(x^*) \ge 0 \quad \text{also},

since y^{s_j} \to y^*, x^{s_j} \to x^*. This contradicts A < 0, and hence Case B is not possible. Theorem 3.1 is therefore proven.
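A minimal implementation of the alternating procedure, applied to Example 2.3 with a fine grid standing in for S (the grid, the starting points, and the tolerance are arbitrary illustrative choices):

```python
import numpy as np
from scipy.optimize import linprog

phi = lambda x: 1.0 / (1.0 + x)          # Example 2.3: min y1 + y2/2
c = np.array([1.0, 0.5])
S = np.linspace(0.0, 1.0, 1001)          # fine grid standing in for S

def alternating(c, pts, tol=1e-9, max_iter=100):
    """Alternate between the LP over the current finite constraint set
    and a search over S for the minimizing (most violated) point."""
    pts = list(pts)
    for _ in range(max_iter):
        A_ub = -np.column_stack([np.ones(len(pts)), np.array(pts)])
        y = linprog(c, A_ub=A_ub, b_ub=-phi(np.array(pts)),
                    bounds=[(None, None)] * 2).x
        slack = y[0] + y[1] * S - phi(S)
        j = int(np.argmin(slack))
        if slack[j] >= -tol:             # minimum nonnegative: terminate
            return y
        pts.append(S[j])                 # otherwise add the minimizer
    return y

y = alternating(c, pts=[0.0, 0.5])       # y = (1, -1/2), objective 3/4
```

On this instance the procedure adds the point x = 1 after the first LP and then terminates, reproducing the optimal solution of Example 2.3.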
The alternating method might be effective when an approximate solution vector of D_F is known. If the objective function of D_F is linear, one can use the algorithms given in Gustafson [20] and Gustafson-Kortanek-Rom [26]. As a matter of fact, if the objective function is convex, D_F can be solved by solving a sequence of semi-infinite programs. We show now that the solution vector of D_F can be constructed as an accumulation point of the sequence y^0, y^1, \ldots constructed recursively as follows.
Let y^0 belong to K (see Definition 2.1).
When y^0, y^1, \ldots, y^{l-1} (l = 1, 2, \ldots) are determined, we define the linear functions

\pi_j(y) = G(y^j) + \sum_{r=1}^{n} (y_r - y_r^j) \Bigl( \frac{\partial G}{\partial y_r} \Bigr)_{y=y^j}.

Then we define y^l as the optimal solution of the problem

\min \pi_{l-1}(y),

subject to

\pi_{l-1}(y) \ge \pi_j(y), \quad j = 0, 1, \ldots, l-2,

\sum_{r=1}^{n} y_r u_r(x) \ge \phi(x), \quad x \in S_K,

\sum_{r=1}^{n} y_r u_r(x) \ge \phi(x), \quad x \in S_F.

From Kelley [33] we conclude that \{y^l\} contains a subsequence converging toward an optimal solution vector of D_F.
The methods discussed here can be of use to construct an approximate solution of the system NL, which is then solved by means of the rapidly converging Newton-Raphson iteration.
3.3 Approximation by Problems with a Finite Number of Constraints
3.3.1 Generalities
We first introduce:
DEFINITION 3.3: A finite subset T of S_K, T = \{x^1, x^2, \ldots, x^N\}, is called a grid.
DEFINITION 3.4: We denote by problem D_F - T the task:

Compute \min G(y),

subject to

\sum_{r=1}^{n} y_r u_r(x^j) \ge \phi(x^j), \quad x^j \in T,

\sum_{r=1}^{n} y_r u_r(x) \ge \phi(x), \quad x \in S_F.

(S_F is the same set as in the definition of problem D_F.) This problem can be solved by standard techniques of mathematical programming. The solution of D_F - T can be used to approximate that of D_F. We shall now derive bounds for the discrepancy and nonoptimality expressed in grid data and characteristics of the functions u_1, u_2, \ldots, u_n, \phi, and G.
3.3.2 Error Bounds for Optimal Solutions of D_F - T
We need two definitions.
DEFINITION 3.5: Let a grid T and a norm be given on S_K. The number |T| given by

|T| = \max_{x \in S_K} \min_{x^j \in T} \| x - x^j \|

will be called the coarseness of T.
This definition of |T| agrees with the concept of "density" in Cheney [9, p. 84], but it is not consonant with the definition in Gustafson [20, p. 350]. The latter could not be directly extended to multidimensional grids. Since S_K is finite-dimensional, all norms are topologically equivalent and therefore any norm can be used.
DEFINITION 3.6 (Continuity modulus of a real-valued function \psi): If \psi is continuous on S_K, we define the function \omega_\psi as follows:

\omega_\psi(z) = \sup | \psi(x') - \psi(x'') |,

subject to

x' \in S_K, \quad x'' \in S_K, \quad \| x' - x'' \| \le z.

\omega_\psi is called the modulus of continuity of \psi. \omega_\psi is nonnegative and increasing. Since \psi is continuous on the compact set S_K, \lim_{z \to +0} \omega_\psi(z) = 0. (Compare Cheney [9, p. 86].)
We can now prove:
LEMMA 3.1: Let y^T be an optimal solution of D_F - T. Then

(3.1) \delta(y^T) \ge -\Delta_T(|T|), where

(3.2) \Delta_T(|T|) = \sum_{r=1}^{n} |y_r^T| \, \omega_{u_r}(|T|) + \omega_\phi(|T|).
PROOF (the arguments are a slight generalization of those in Gustafson [20, pp. 351-352]): Put

\psi = \sum_{r=1}^{n} y_r^T u_r - \phi.

Let x \in S_K. We want to get a lower bound for \psi(x). By the definition of |T| there is a grid point x^j such that \| x - x^j \| \le |T|. We write

\psi(x) = \psi(x^j) + \psi(x) - \psi(x^j),

giving

\psi(x) \ge \psi(x^j) - | \psi(x) - \psi(x^j) |.

Hence

\psi(x) \ge \psi(x^j) - \omega_\psi(\| x - x^j \|) \ge -\omega_\psi(|T|),

since \psi(x^j) \ge 0, \| x - x^j \| \le |T|, and \omega_\psi is positive and increasing. We find immediately \omega_\psi(|T|) \le \Delta_T(|T|). Q.E.D.
This result can be strengthened to:
LEMMA 3.2: Let y^T be an optimal solution of D_F - T. Then

(3.3) \delta(y^T) \ge -\Delta(|T|), where

(3.4) \Delta(|T|) = \omega_\phi(|T|) + F \sum_{r=1}^{n} \omega_{u_r}(|T|), \quad \text{and} \quad F = \max \{ |F_1|, |F_2| \}.

PROOF: Use the fact that |y_r^T| \le F.
The bound in Lemma 3.2 is more conservative than that of Lemma 3.1, but it is a priori in the sense that we do not need to know y^T in order to evaluate it. Hence we can tell in advance how small |T| must be selected in order to get a discrepancy below a given tolerance. We next derive bounds for nonoptimality.
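Definition 3.5 is immediate to compute when S_K = [0, 1]; in the sketch below a dense auxiliary grid merely stands in for the maximization over all of S_K, and the grid T is a hypothetical example:

```python
import numpy as np

# Coarseness |T| of a grid T in S_K = [0, 1] (Definition 3.5):
# the largest distance from a point of S_K to its nearest grid point.
T = np.array([0.0, 0.1, 0.35, 0.7, 1.0])
dense = np.linspace(0.0, 1.0, 100001)       # stand-in for all of S_K
coarseness = np.max(np.min(np.abs(dense[:, None] - T[None, :]), axis=1))
# here |T| = 0.175, attained midway between the grid points 0.35 and 0.7
```

With |T| in hand, the a priori bound (3.4) needs only the moduli of continuity (or, via Theorem 3.2 below, derivative bounds) of \phi and the u_r.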
LEMMA 3.3: Let y^T be an optimal solution of D_F - T. Let

(3.5) H = \sum_{r=1}^{n} c_r u_r

be positive* over S_K and put

(3.6) \gamma = \min_{x \in S_K} H(x).

Put

(3.7) y_r^H = y_r^T + \gamma^{-1} c_r \Delta_T(|T|), \quad r = 1, 2, \ldots, n,

where \Delta_T is defined by (3.2). Then

(3.8) \Bigl| \tfrac{1}{2} \bigl( G(y^T) + G(y^H) \bigr) - G(\bar{y}) \Bigr| \le \tfrac{1}{2} \bigl( G(y^H) - G(y^T) \bigr),

where \bar{y} is an optimal solution of D_F.

*The existence of H is guaranteed by Krein's condition.
PROOF: Put

Q^H = \sum_{r=1}^{n} y_r^H u_r \quad \text{and} \quad Q^T = \sum_{r=1}^{n} y_r^T u_r.

We want to show that Q^H(x) \ge \phi(x), x \in S_K. We write

Q^H(x) - \phi(x) = Q^H(x) - Q^T(x) + Q^T(x) - \phi(x),

giving

Q^H(x) - \phi(x) = \gamma^{-1} \Delta_T(|T|) H(x) + Q^T(x) - \phi(x).

Thus

Q^H(x) - \phi(x) \ge \Delta_T(|T|) \bigl( \gamma^{-1} H(x) - 1 \bigr) \ge 0.

Hence y^H is a feasible solution of D_F and therefore

G(y^H) \ge G(\bar{y}).

On the other hand, \bar{y} is feasible for D_F - T while y^T is the optimal solution of D_F - T, and we can therefore conclude

G(y^H) \ge G(\bar{y}) \ge G(y^T),

from which (3.8) follows. Q.E.D.
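Lemma 3.3 can be tried out on Example 2.3, where u_1 = 1, u_2 = x, Krein's condition holds with H = u_1 (so c = (1, 0) and \gamma = 1), and |\phi'| \le 1 gives the crude bound \omega_\phi(z) \le z; using an upper bound for the modulus only enlarges \Delta_T, so feasibility of y^H is preserved. The grid below is an illustrative choice, and the true optimal value is 3/4:

```python
import numpy as np
from scipy.optimize import linprog

phi = lambda x: 1.0 / (1.0 + x)
T = np.linspace(0.1, 0.9, 9)                # grid with coarseness |T| = 0.1
coarse = 0.1

# y^T: optimal solution of the discretized problem D_F - T
yT = linprog([1.0, 0.5],
             A_ub=-np.column_stack([np.ones_like(T), T]),
             b_ub=-phi(T), bounds=[(None, None)] * 2).x

# Delta_T(|T|) = |y2^T| * omega_{u2}(|T|) + omega_phi(|T|),
# with omega_{u2}(z) = z and the bound omega_phi(z) <= z
delta_T = abs(yT[1]) * coarse + coarse
yH = yT + np.array([delta_T, 0.0])          # (3.7) with gamma = 1, c = (1, 0)

G = lambda y: y[0] + 0.5 * y[1]
# (3.8): the optimal value 3/4 is bracketed by G(y^T) and G(y^H)
```

Here G(y^T) is roughly 0.72 and G(y^H) roughly 0.87, so the interval [G(y^T), G(y^H)] indeed contains the optimal value 0.75.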
Lemma 3.3 can be used to derive a posteriori bounds for nonoptimality. We can namely prove Lemma 3.4.
LEMMA 3.4: Let y^T be an optimal solution of D_F - T. Then we can replace \Delta_T in (3.7) by \Delta in (3.4), and (3.8) gives an a priori bound. Further, if v is such that

\Bigl| \frac{\partial G}{\partial y_r} \Bigr| \le v_r, \quad r = 1, 2, \ldots, n,

everywhere, then

\Bigl| \tfrac{1}{2} \bigl( G(y^T) + G(y^H) \bigr) - G(\bar{y}) \Bigr| \le \tfrac{1}{2} \gamma^{-1} \Delta(|T|) \sum_{r=1}^{n} | c_r v_r |.
Often bounds on the partial derivatives of u_1, \ldots, u_n and \phi are known. Then the expressions for \Delta and \Delta_T can be simplified. Let x' and x' + h belong to S_K. Then the mean-value theorem gives

\phi(x' + h) - \phi(x') = \sum_{r=1}^{k} \Bigl( \frac{\partial \phi}{\partial x_r} \Bigr)_{x=\xi} h_r,

where \xi = x' + \theta h for some number \theta in (0, 1). Put

(3.9) \kappa_\phi = \sup_{x \in S_K} \max_r \Bigl| \frac{\partial \phi}{\partial x_r} \Bigr|.

Hence

| \phi(x' + h) - \phi(x') | \le \| h \| \, k \, \kappa_\phi,

giving

\omega_\phi(z) \le z k \kappa_\phi.

Defining \kappa_{u_r} in the same manner as \kappa_\phi, we obtain

(3.10) \Delta_T(t) \le t k \Delta_T',

(3.11) \Delta(t) \le t k \Delta',

where

\Delta_T' = \kappa_\phi + \sum_{r=1}^{n} | y_r^T | \, \kappa_{u_r}, \qquad \Delta' = \kappa_\phi + F \sum_{r=1}^{n} \kappa_{u_r}, \quad \text{and} \quad F = \max \{ |F_1|, |F_2| \}.

If we replace \Delta_T and \Delta with the bounds (3.10) and (3.11) and revise the arguments in the preceding four lemmas, we arrive at Theorem 3.2.
THEOREM 3.2: If u_1, u_2, ..., u_n and φ have bounded partial derivatives of the first order and y^T is an optimal solution of D_F - T, one can give explicit a priori bounds on the nonoptimality and discrepancy of y^T. These bounds are proportional to |T|.
A further refinement is possible if u_1, u_2, ..., u_n and φ have continuous partial derivatives of the second order.

Let y^T be an optimal solution of D_F - T. Put

ψ = Σ_{r=1}^n y_r^T u_r - φ.

Then ψ(x^j) ≥ 0 for all x^j ∈ T. A lower bound of ψ(x), x ∈ S_k, can be constructed from the following result.
LEMMA 3.5: Let x^1, x^2, ..., x^{k+1} be k + 1 given points in S_k and h a number such that

|x_i^r - x_i^1| ≤ h, i = 1, 2, ..., k; r = 1, 2, ..., k+1,

and the determinant

det [ 1  x_1^r  x_2^r  ...  x_k^r ]_{r=1,...,k+1} ≠ 0.

Take a fixed point x in the convex hull U of x^1, x^2, ..., x^{k+1}. Put R(x) = sup f(x), when f varies over all functions with continuous partial derivatives of the second order such that

(3.12) |∂²f/∂x_i∂x_j| ≤ 2c_{ij}, x ∈ U,

where c_{ij}, i = 1, 2, ..., k; j = 1, 2, ..., k, are given constants and f meets the condition

(3.13) f(x^j) = 0, j = 1, 2, ..., k+1;

then

(3.14) R(x) = Σ_{i=1}^k Σ_{j=1}^k c_{ij} | x_i x_j - Σ_{r=1}^{k+1} λ_r x_i^r x_j^r |,

where λ_1, λ_2, ..., λ_{k+1} are determined by

x = Σ_{r=1}^{k+1} λ_r x^r, Σ_{r=1}^{k+1} λ_r = 1.

REMARK: To determine R(x) for a fixed x is an instance of problem D in Section 1 when S is infinite-dimensional; S is namely the space of all functions of k variables which have continuous partial derivatives of the second order.
PROOF: Without loss of generality, we can assume that the coordinates are chosen such that x^1 = 0. Taylor's formula (expanding about x^1) gives, since f(x^1) = 0:

f(x^r) = Σ_{j=1}^k a_j x_j^r + Σ_{i=1}^k Σ_{j=1}^k b_{ij}^r x_i^r x_j^r, r = 2, 3, ..., k+1,

where

a_j = (∂f/∂x_j)_{x=x^1},

b_{ij}^r = ½ (∂²f/∂x_i∂x_j)_{x=θ_r x^r}, 0 ≤ θ_r ≤ 1.

Hence we arrive at the problem: Compute

L = sup [ Σ_{j=1}^k a_j x_j + Σ_{i=1}^k Σ_{j=1}^k b_{ij} x_i x_j ],

subject to

(3.15) Σ_{j=1}^k a_j x_j^r + Σ_{i=1}^k Σ_{j=1}^k b_{ij}^r x_i^r x_j^r = 0, r = 2, 3, ..., k+1,

|b_{ij}| ≤ c_{ij}, |b_{ij}^r| ≤ c_{ij}.
Since x is in the convex hull of {x^r}, r = 1, 2, ..., k+1, we can write

x = Σ_{r=1}^{k+1} λ_r x^r, where Σ_{r=1}^{k+1} λ_r = 1, λ_r ≥ 0, r = 1, 2, ..., k+1.

(3.15) gives

Σ_{j=1}^k a_j x_j = - Σ_{r=2}^{k+1} λ_r Σ_{i=1}^k Σ_{j=1}^k b_{ij}^r x_i^r x_j^r.

Thus we are left with the task: Compute

L = sup Σ_{i=1}^k Σ_{j=1}^k ( b_{ij} x_i x_j - Σ_{r=1}^{k+1} λ_r b_{ij}^r x_i^r x_j^r ),

subject to

|b_{ij}| ≤ c_{ij}, |b_{ij}^r| ≤ c_{ij}.

Hence we should take

b_{ij} = c_{ij} sign( x_i x_j - Σ_{r=1}^{k+1} λ_r x_i^r x_j^r ).

Entering |b_{ij}| = |b_{ij}^r| = c_{ij}, we get the bound sought. The determinant condition implies that λ_1, λ_2, ..., λ_{k+1} are uniquely determined by x and x^1, x^2, ..., x^{k+1}, and hence the lemma is proven.
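For a fixed x, the bound (3.14) is a finite computation: solve for the barycentric coordinates λ_r and sum the weighted deviations. A sketch (names are ours; it assumes the determinant condition holds, so the barycentric system is nonsingular):

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for small dense systems."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for j in range(col, n + 1):
                M[r][j] -= f * M[col][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def lemma_3_5_bound(points, x, c):
    """Evaluate (3.14): points = [x^1, ..., x^{k+1}], x in their convex
    hull, c[i][j] the constants of (3.12)."""
    k = len(x)
    # lambda_r from sum(lam_r) = 1 and sum(lam_r * x^r) = x
    A = [[1.0] * (k + 1)] + [[p[i] for p in points] for i in range(k)]
    lam = solve(A, [1.0] + list(x))
    R = 0.0
    for i in range(k):
        for j in range(k):
            s = sum(l * p[i] * p[j] for l, p in zip(lam, points))
            R += c[i][j] * abs(x[i] * x[j] - s)
    return R
```

With k = 1, nodes 0 and 1 and c_11 = 1 (i.e. |f''| ≤ 2), the bound at the midpoint is 0.25, the classical one-dimensional value h²·max|f''|/8.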
If we now make the substitutions x^r = hξ^r, x = hξ in (3.14), we get

R(x) ≤ h² Σ_{i=1}^k Σ_{j=1}^k c_{ij} | ξ_i ξ_j - Σ_{r=1}^{k+1} λ_r ξ_i^r ξ_j^r |,

where

ξ = Σ_{r=1}^{k+1} λ_r ξ^r.

If now the bounds on the derivatives hold uniformly over S_k, we get

δ(y^T) ≤ e |T|²,

where e is determined by the distribution of the grid points T and the bounds of the second-order partial derivatives on S_k. Hence, if we consider a sequence T_1, T_2, ... of hypercubic grids, the bound on δ(y^{T_i}) decreases as the square of |T_i|, i = 1, 2, .... Revising the arguments leading to Theorem 3.2 above, we find that the same holds true for the bound on the nonoptimality.
3.3.3 Convergence Results when |T| → 0
LEMMA 3.6: To every ε > 0 there is an h > 0 such that if |T| < h and y^T is an optimal solution of D_F - T, there is a ȳ which is an optimal solution of D_F and satisfies ‖ȳ - y^T‖ < ε.

PROOF: The same arguments apply as in Gustafson [20, Theorem 3.3]. This result can be both generalized and sharpened.
If D_F - T has an optimal solution y^T, we can associate with it the problem P_F - T given below.

DEFINITION 3.7: Let the grid T be {x^1, x^2, ..., x^N}. We denote by problem P_F - T the task:

Compute max Σ_{j=1}^N m_j φ(x^j) + Σ_{r=1}^n (F_1 v_r^+ - F_2 v_r^-),

subject to Σ_{j=1}^N m_j u_r(x^j) + v_r^+ - v_r^- = ∂G/∂y_r, r = 1, 2, ..., n,

m_j ≥ 0, v_r^+ ≥ 0, v_r^- ≥ 0.

P_F - T is a linear program (even if G is not linear) and hence has an optimal solution which corresponds to a point-mass distribution with, at most, n mass-points. Select such a solution of P_F - T with the minimum number ν of mass-points. Then with each T we can associate the vector z(T) given by

(3.16) z(T) = (ξ^1(T), ξ^2(T), ..., ξ^ν(T), y(T)),

where the ξ^j are the mass-carrying points. An optimal solution of P_F - T and D_F - T is uniquely determined if z(T) is given, since the vectors u(ξ^1(T)), ..., u(ξ^ν(T)) are linearly independent and v_r^+, v_r^- enter the solution if y_r^T = F_1 or F_2, respectively.

Let ‖·‖ be a norm on S_k. We define ‖z‖ by

‖z‖ = Σ_{j=1}^ν ‖ξ^j‖ + Σ_{r=1}^n |y_r|.
THEOREM 3.3: Let T_j, j = 1, 2, ..., be a sequence of grids such that |T_j| → 0 when j → ∞. With each T_j we associate the vector z(T_j), defined analogously with z(T) above. Then we can find a subsequence z(T_{j_s}) converging towards a vector z̄ which describes a solution of P_F and D_F.

PROOF: Since ν ≤ n, z(T) has at most n(k + 1) components. S_k is a compact set and F_1 ≤ y_r ≤ F_2. Hence we can find a number B such that ‖z(T_j)‖ ≤ B. Therefore {z(T_j)} is confined to a bounded subset of a finite-dimensional Banach space. Thus {z(T_j)} is contained in a compact set. We first select a subsequence T_{j_1}, T_{j_2}, ... such that y(T_{j_s}) converges towards a vector ȳ. Using the same arguments as in Theorem 3.3 in Gustafson [20], we establish that ȳ is an optimal solution of D_F.

To each y(T_{j_s}) there is an optimal solution of P_F - T_{j_s}. We want to define vectors z(T_{j_s}) according to (3.16), which we do recursively as follows:

Let z(T_{j_s}) be given and of the form

z(T_{j_s}) = (ξ^1(T_{j_s}), ξ^2(T_{j_s}), ..., ξ^ν(T_{j_s}), y(T_{j_s})).
Let P_F - T_{j_{s+1}} have an optimal solution with the mass-carrying points u^1, u^2, ..., u^ν̄. We now define ξ^l(T_{j_{s+1}}) equal to a vector from the set {u^1, u^2, ..., u^ν̄} which minimizes

‖ξ^l(T_{j_s}) - u^α‖,

when 1 ≤ α ≤ ν̄. This is done for l = 1, 2, ..., min(ν, ν̄).

There are three cases:

(a) ν̄ < ν. Then the vectors in z(T_{j_s}) which have not been matched are put in the vector for z(T_{j_{s+1}}).

(b) ν̄ = ν. No subsequent change in the definition of z(T_{j_{s+1}}) is done.

(c) ν̄ > ν. The vectors from {u^1, u^2, ..., u^ν̄} which have not been matched are transferred to z(T_{j_{s+1}}). Hence, in all cases, max(ν, ν̄) vectors are put in the vector for z(T_{j_{s+1}}), which also contains y(T_{j_{s+1}}).

In the manner described above, we define recursively z(T_{j_1}), z(T_{j_2}), .... In no case will any of these vectors contain more than n points.

We can now take a subsequence which converges towards an accumulation point, which we call z*, from which a solution of D_F and its corresponding point-masses can be constructed.
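One step of the recursive matching just described can be sketched as follows (a loose greedy version, in our own notation; it ignores the possibility of two old points selecting the same new point, which the construction does not need to exclude):

```python
import math

def match_points(prev_pts, new_pts):
    """One step of the recursive construction: match the first
    min(v, v_bar) old mass points to their nearest new mass points,
    then carry over whichever side has points left unmatched."""
    matched = [min(new_pts, key=lambda u: math.dist(p, u))
               for p in prev_pts[:len(new_pts)]]
    if len(new_pts) < len(prev_pts):
        # case (a): keep the unmatched old points
        extra = prev_pts[len(new_pts):]
    else:
        # cases (b) and (c): transfer the unmatched new points (if any)
        extra = [u for u in new_pts if u not in matched]
    return matched + extra
```

In all cases the result holds max(ν, ν̄) points, as in the proof.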
REMARK: From the construction of z* it is obvious that certain of the points represented in the vector z* carry the mass zero. They can hence be removed. Other points can be confluent. If this is the case, only one member from every group of confluent points is carried. The last-mentioned case is common. Compare Gustafson [20] and subsection 3.3.5.
The idea of approximating a semi-infinite program by an optimization problem with a finite number of constraints is, of course, not new. Convergence results of the same character can be found, e.g., in Vershik-Temel't [48] and Cheney [9, pp. 86-88].
3.3.4 Special Devices to Economize and Stabilize the Simplex Method
In this subsection we consider the case when G is linear, that is,

G(y) = Σ_{r=1}^n y_r μ_r.

(As noted in Section 3.2, every convex semi-infinite program can be solved by solving a sequence of linear semi-infinite programs.)

In this case, D_F - T is a linear program and an optimal vector y^T can be constructed with the Simplex method.
Let now y be a candidate for y^T and put

(3.17) ψ_y = Σ_{r=1}^n y_r u_r - φ.

We want to investigate if ψ_y(x^j) ≥ 0, x^j ∈ T. Let x^i ∈ T be such that ψ_y(x^i) > 0. For all x in S_k we find the bound

ψ_y(x) ≥ ψ_y(x^i) - ω_{ψ_y}(‖x - x^i‖).
We note that

ω_{ψ_y}(t) ≤ Σ_{r=1}^n |y_r| ω_{u_r}(t) + ω_φ(t), t > 0.

If we put

S_1 = {x | ω_{ψ_y}(‖x - x^i‖) ≤ ψ_y(x^i)},

we can conclude that x ∈ S_1 implies ψ_y(x) ≥ 0. Thus, in particular, if x^j ∈ T is in S_1, the corresponding column need not even be generated in order to establish that ψ_y(x^j) ≥ 0.
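The screening set S_1 gives a cheap test. If, say, ω_{ψ_y}(t) ≤ Lt for a known constant L (e.g. L = Σ|y_r| L_{u_r} + L_φ with Lipschitz constants of the u_r and φ), the test reads L‖x - x^i‖ ≤ ψ_y(x^i). A sketch assuming such an L is available (names are ours):

```python
import math

def screen_columns(grid, psi_at_ref, x_ref, L):
    """Split grid points into those whose column must be generated and
    those guaranteed nonnegative: if L*||x - x^i|| <= psi_y(x^i), then
    x lies in S_1 and psi_y(x) >= 0 without further computation.

    psi_at_ref : the (positive) value psi_y(x^i) at the reference point
    L          : an upper Lipschitz constant for psi_y (an assumption)
    """
    keep, skip = [], []
    for x in grid:
        if L * math.dist(x, x_ref) <= psi_at_ref:
            skip.append(x)      # in S_1: sign already settled
        else:
            keep.append(x)      # sign must be computed explicitly
    return keep, skip
```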
Another problem is that numerical difficulties can be anticipated when |T| is small, due to the fact that the Simplex algorithm calls for the solution of a linear system whose matrix of coefficients might be nearly singular. Each Simplex iteration consists of two stages:

A. Determination of a candidate vector y by solution of the system

(3.18) Σ_{r=1}^n y_r u_r(x^j) = φ(x^j), j = 1, 2, ..., n,

where x^j belongs to T and the corresponding rows are linearly independent. Then we have to find the sign of ψ_y(x^i) for all x^i ∈ T, where ψ_y is defined by (3.17).

B. If min ψ_y(x^i), x^i ∈ T, is assumed for i = i_1, we introduce x^{i_1} into the next basis, and hence we have to solve systems of the form

(3.19) Σ_{j=1}^n m_j u_r(x^j) = μ_r, r = 1, 2, ..., n; m_j ≥ 0.

As remarked in Gustafson [20], the abscissae often lie close together in pairs, which will cause the matrix of coefficients in (3.18) and (3.19) to be ill-conditioned.
Consider first the problem of determining ψ_y(x) from (3.18). Let ỹ be the computed value of y and put

ε_j = Σ_{r=1}^n ỹ_r u_r(x^j) - φ(x^j), j = 1, 2, ..., n.

Let further Δψ_y(x) be the error in the value of ψ_y(x) caused by the fact that we use ỹ instead of y. Gustafson [22] gives the bound

|Δψ_y(x)| ≤ p(x) max_j |ε_j|,

where

Σ_{j=1}^n p_j(x) u_r(x^j) = u_r(x), r = 1, 2, ..., n.

Put

p(x) = Σ_{j=1}^n |p_j(x)|.
We note then that p(x) is a continuous function over S_k and that p(x^j) = 1, j = 1, 2, ..., n. Further, p is completely determined by the system u_1, u_2, ..., u_n and hence independent of the way we perform our computations.
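The amplification factor p(x) is itself computable: the p_j solve the linear system Σ_j p_j(x) u_r(x^j) = u_r(x). A sketch (names are ours; the elimination routine is a generic partial-pivoting solve):

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for small dense systems."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for j in range(col, n + 1):
                M[r][j] -= f * M[col][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def amplification(u_funcs, nodes, x):
    """p(x) = sum_j |p_j(x)|, where the p_j solve
    sum_j p_j(x) * u_r(x^j) = u_r(x), r = 1, ..., n.
    Bounds how residuals of (3.18) propagate into psi_y(x)."""
    A = [[u(xj) for xj in nodes] for u in u_funcs]
    p = solve(A, [u(x) for u in u_funcs])
    return sum(abs(pj) for pj in p)
```

For the hypothetical basis u_1(t) = 1, u_2(t) = t with nodes 0 and 1, p equals 1 at the nodes, consistent with p(x^j) = 1 above, and grows when x leaves the node interval.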
The ε_j are, however, dependent both on the computer and on the manner in which (3.18) is solved. Let ε̄_j be the value we should obtain if we inserted the exact vector y in (3.18) and evaluated the residuals computationally. Wilkinson [50, p. 252] states that if (3.18) is solved by means of Gaussian elimination with pivoting, then |ε_j| ≤ 3|ε̄_j| even if the system is ill-conditioned. Therefore, if we solve (3.18) in this mode, Δψ_y(x) is as small as possible. However, the ordinary Simplex method does not provide pivoting when the sequence of linear systems is solved, a fact that may be the cause of the often observed instability of linear programming codes. In contrast, the variant of Bartels-Golub-Saunders [52] holds promise to be more stable.
When we solve (3.19), it is crucial that the m_j remain positive.
3.3.5 Construction of an Initial Approximation for the Newton-Raphson Method
We discuss now how to construct an initial approximation for system NL (Definition 2.3) when the solution of D_F - T and P_F - T is known. In this subsection we make the following general assumptions:

A1: u_1, u_2, ..., u_n and φ have continuous partial derivatives of the second order.

A2: G is linear:

(3.20) G(y) = Σ_{r=1}^n μ_r y_r.

(If G has continuous partial derivatives, we put

(3.21) G(y) ≈ G(y*) + Σ_{r=1}^n (y_r - y_r*)(∂G/∂y_r)_{y=y*},

where y* is a solution of D_F - T.)

A3: The matrix A_y(x) given by

(3.22) A_y(x) = (∂²ψ_y/∂x_i∂x_j)

is positive definite when x is a zero of ψ_y, where ψ_y is defined by (3.17).

REMARK: The linearization (3.21) is used when we employ the iterative process described at the end of subsection 3.2. Assumption A3 entails that the zeroes correspond to strict minima of ψ_y. A3 is difficult to verify in advance.
The major problem in finding an approximate solution is to determine the number of mass-points in an optimal solution of D_F when the solution of D_F - T and its primal P_F - T are known. Let y^T be an optimal solution of D_F - T and let an optimal solution of P_F - T be described by the pairs

(3.23) (ξ^i, m_i), i = 1, 2, ..., n'.

*Research Report, Department of Computer Sciences, Stanford, 1969.
The vector ξ^i gives the location of the mass m_i. P_F - T may have many optimal solutions, but we can always take one with n' ≤ n.

DEFINITION 3.8: A subset C_j of {ξ^i}_{i=1}^{n'} is called a cluster if each member of C_j lies at most 3|T| from some other member of C_j, and C_j cannot be expanded by inclusion of more elements of {ξ^i}_{i=1}^{n'}. Thus we divide the set (3.23) uniquely into q clusters, where 1 ≤ q ≤ n'.
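The clusters of Definition 3.8 can be formed by transitive grouping under the 3|T| distance threshold, as in this sketch (the chain-linking reading of the definition is our interpretation; names are ours):

```python
import math

def clusters(points, grid_norm):
    """Group mass points into clusters: points linked by a chain of
    members at pairwise distance <= 3*|T| end up in the same cluster."""
    thresh = 3.0 * grid_norm
    groups = []
    for p in points:
        # every existing cluster that p touches ...
        hits = [g for g in groups if any(math.dist(p, q) <= thresh for q in g)]
        # ... is merged, together with p, into a single cluster
        merged = [p] + [q for g in hits for q in g]
        groups = [g for g in groups if g not in hits] + [merged]
    return groups
```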
DEFINITION 3.9: Put

ψ_T = Σ_{r=1}^n y_r^T u_r - φ

and define the matrix A_T analogously with A_y in (3.22).

C_j is called a point-group if A_T(x^j) is positive definite and

‖∇ψ_T(x^j)‖ ≤ 0.5 ‖A_T(x^j)‖, for all x^j ∈ C_j.

Two cases are possible, namely: case a: all clusters are point-groups; case b: there is a cluster that is not a point-group.
LEMMA 3.7: Let A1, A2, A3 hold. Then there is a number h' such that if |T| < h' then all clusters are point-groups.

PROOF: Assume the contrary. Using Theorem 3.3, we can then select a sequence T_1, T_2, ... of grids such that |T_l| → 0 and such that the corresponding vectors z(T_l) tend to a vector z̄, while at least one of the clusters of z(T_l) is not a point-group. Let the clusters be C_1(l), C_2(l), ..., C_{q(l)}(l). Each of the clusters contains at most n elements, and hence their diameter is less than 3n|T_l|. Hence all the mass-carrying points in a cluster converge toward the same point. Denote these limit-points by ξ_j, j = 1, 2, ..., q. Using Assumption A3, we conclude that there is a δ_0 > 0 such that if ‖x - ξ_j‖ < δ_0, j = 1, 2, ..., q, then ‖A(x)‖ ≥ 4‖∇ψ(x)‖. The convergence of y_1, y_2, ... implies that there is an N_1 such that ‖A_{T_l}(x)‖ ≥ 2‖∇ψ_{T_l}(x)‖, l > N_1, for ‖x - ξ_j‖ < δ_0. In the same manner we establish that there is an N_2 such that l > N_2 implies that A_{T_l}(x) is positive definite. Let N be such that the mass-carrying points lie within δ_0 of the limit-points for l > N. Therefore, if l > max(N, N_1, N_2), then C_j is a point-group, j = 1, 2, ..., q. Hence the sought contradiction is established.
If T is such that the set (3.23) can be subdivided into clusters, all of which are point-groups, we construct an initial approximation to system NL as follows:

1. The masses in each point-group are combined and allocated at the group's center of gravity. Hence we take q equal to the number of point-groups, and each point-group corresponds to a mass-point.

2. If a mass-point is less than 3|T| from the boundary, it is moved to the nearest boundary-point.

3. The point-mass distribution so obtained is taken as the first approximation.

4. Equations (2.4) and (2.8) are determined by the distribution of mass-points.
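Steps 1 and 2 above can be sketched as follows (the per-coordinate boundary test assumes S_k is a box, which the paper does not require; all names are ours):

```python
def initial_masspoints(groups, grid_norm, box):
    """Steps 1 and 2: combine each point-group's masses at its centre of
    gravity; a centre closer than 3*|T| to the boundary is snapped onto it.

    groups: list of point-groups, each a list of (point, mass) pairs
    box:    per-coordinate bounds (lo, hi), assuming S_k is a box
    """
    out = []
    for g in groups:
        total = sum(w for _, w in g)
        dim = len(g[0][0])
        centre = [sum(w * p[i] for p, w in g) / total for i in range(dim)]
        snapped = []
        for c, (lo, hi) in zip(centre, box):
            if c - lo < 3 * grid_norm:
                c = lo                  # move to nearest boundary point
            elif hi - c < 3 * grid_norm:
                c = hi
            snapped.append(c)
        out.append((snapped, total))
    return out
```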
3.3.6 Use of Newton-Raphson Methods
Applying the methods described in the preceding section, we obtain an approximate solution of the system NL.

A solution of this system can be described by a vector z of the general structure

(3.24) z = (m_1, x^1, m_2, x^2, ..., m_q, x^q, v^+, v^-, y),

where the point x^i carries the mass m_i and y is an optimal solution of D_F.
DEFINITION 3.10: Any vector of the general structure (3.24) will be called a trial vector if m_i > 0, i = 1, 2, ..., q; v_r^+ ≥ 0, r = 1, 2, ..., n; and v_r^- ≥ 0, r = 1, 2, ..., n.

LEMMA 3.8: Any solution of (2.3)-(2.6), (2.8) which is a trial vector can be used to give a lower bound of the optimal value of D_F. If also

Σ_{r=1}^n y_r u_r(x) ≥ φ(x), x ∈ S_k,

then y is an optimal solution of D_F.

PROOF: A trial vector which is a solution describes a feasible solution of P_F - T for a certain T. The conclusions then follow from known duality relations. Q.E.D.
We want to generate a sequence of trial vectors which converges toward an optimal solution. We write the system NL in the general form

W(z) = 0.

Assume a trial vector z^i is known. Then we want to construct a correction h^i such that

(3.25) ‖W(z^i + h^i)‖ < ‖W(z^i)‖,

and then put

(3.26) z^{i+1} = z^i + h^i,

if h^i can be selected such that z^i + h^i is a trial vector.

In the classical Newton-Raphson method, we take h^i as the solution b^i of the system

(3.27) Σ_{r=1}^N (∂W_j(z^i)/∂z_r) b_r^i + W_j(z^i) = 0, j = 1, 2, ..., N,

where N is the number of components of z. If the matrix in (3.27) is singular, the methods in Ben-Israel [2] can be used.

In Ortega-Rheinboldt [40, p. 421] we find general criteria for the convergence of the Newton-Raphson method, but they cannot, in general, be used. However, if the matrix in (3.27) is regular for all z^i and the condition (3.25) is met, then {z^i} converges toward a local minimum of the function ‖W(z)‖. The same is true for the modified sequence {z^i} obtained by putting h^i = λb^i, where the real number λ is such that condition (3.25) is met together with the requirement that z^{i+1} is a trial vector.
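The damped iteration h^i = λb^i with conditions (3.25)-(3.26) amounts to a line search that halves λ until the residual shrinks and the iterate remains a trial vector. A generic sketch (the linearized solve of (3.27) is passed in as a callback; all names are ours):

```python
def damped_newton(W, newton_step, z0, is_trial, tol=1e-10, max_iter=50):
    """Damped Newton-Raphson respecting (3.25)-(3.26): halve the step
    length until the residual norm decreases AND the new iterate is
    still a trial vector.  newton_step(z, rhs) must solve the linearized
    system (3.27) for the correction."""
    norm = lambda v: max(abs(comp) for comp in v)
    z = list(z0)
    for _ in range(max_iter):
        r = W(z)
        if norm(r) < tol:
            break
        b = newton_step(z, [-comp for comp in r])   # full Newton correction
        lam = 1.0
        while lam > 1e-12:
            z_new = [zi + lam * bi for zi, bi in zip(z, b)]
            if is_trial(z_new) and norm(W(z_new)) < norm(r):
                z = z_new
                break
            lam *= 0.5                              # damp the step
        else:
            break     # no acceptable step: process assumed to diverge
    return z
```

The toy test solves z² = 2 with the trial-vector constraint z > 0; any realistic system NL would supply its own Jacobian solve.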
The general idea is to generate a sequence of trial vectors until the norm of the residual falls below a value prescribed in advance. If this does not happen before a given maximum time has elapsed, the process is assumed to diverge. Then one can use the last trial vector found to construct better approximations by means of the grid-point methods described in earlier sections.

We note that D_F subsumes a large class of different problems. In many important particular instances, special methods can be used both to simplify the computational scheme and to establish properties of convergence and unicity. We will return to this in later papers.
3.3.7 Remarks on Sensitivity
It is well known from the numerical solution of special cases of D_F (see, e.g., Example 1.3 in the introduction) that small changes in the input data cause large dislocations of the x^i and m_i, but that the optimal value is not affected very much. The x^i are the locations of minima of the function

Σ_{r=1}^n y_r u_r - φ,

and to determine these is an ill-conditioned task in one dimension (see, e.g., Wilkinson [49, p. 39]); this situation cannot be expected to improve when S_k has several dimensions.
A first-order a posteriori approximation of the sensitivity can be made if the matrix of (3.27) is regular. Let f(z) = G(z) and let z̄ be an approximation of a solution z*. Linearizing, we arrive at

df = f(z*) - f(z̄) ≈ ⟨∇f, b⟩,

where

Mb = -W(z̄),

with

M_{ij} = ∂W_i(z̄)/∂z_j, and ⟨a, b⟩ denotes Σ_r a_r b_r.

Hence we find δ = ‖M^{-1} W(z̄)‖ ≤ ‖M^{-1}‖ ‖W(z̄)‖, and we get the approximation ‖z̄ - z*‖ ≈ δ. The estimate of df can be written in an attractive manner. Using Lemma 2.1 in Gustafson [22], we get df = -⟨u, w⟩, where M^T u = ∇f and w = W(z̄). Hence an approximate bound on |df| is given by |⟨u, w⟩|.
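The estimates δ and df require one solve with M (or, in the adjoint form, one with M^T). A sketch using the direct form Mb = -W(z̄); the max-norm is chosen arbitrarily and the names are ours:

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for small dense systems."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for j in range(col, n + 1):
                M[r][j] -= f * M[col][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def sensitivity(M, W_bar, grad_f):
    """First-order a posteriori sensitivity: solve M b = -W(z_bar), take
    delta = ||b|| as an estimate of ||z_bar - z*|| and <grad f, b> as an
    estimate of df (equal to -<u, w> with M^T u = grad f)."""
    b = solve(M, [-w for w in W_bar])
    delta = max(abs(comp) for comp in b)
    df = sum(g * comp for g, comp in zip(grad_f, b))
    return delta, df
```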
This concludes our treatment of the computational scheme.
Recently (1973), a definition of cluster has been given that is independent of the grid size. The convergence theorems are then different from Lemma 3.7.
BIBLIOGRAPHY
[1] Baker, George A., Jr. and John L. Gammel, "Applications of the Principle of the Minimum Maximum Modulus to Generalized Moment Problems and Some Remarks on Quantum Field Theory," J. Math. Anal. and Appl. 33, 197-211 (1971).
[2] Ben-Israel, A., "A Newton-Raphson Method for the Solution of Systems of Equations," J. Math. Anal. and Appl. 15, 243-252 (1966).
[3] Ben-Israel, A., A. Charnes, and K. O. Kortanek, "Asymptotic Duality Over Closed Convex Sets," J. Math. Anal. and Appl. 35 (1971).
[4] Bojanic, R. and R. DeVore, "On Polynomials of Best One-Sided Approximation," L'Enseignement Math. 12, 139-164 (1966).
[5] Buck, R. C., "Alternation Theorems for Functions of Several Variables," J. Approx. Theory 1, 325-334 (1968).
[6] Charnes, A. and W. W. Cooper, Management Models and Industrial Applications of Linear Programming (J. Wiley and Sons, New York, 1961), Vols. I and II.
[7] Charnes, A., W. W. Cooper, and K. O. Kortanek, "Duality, Haar Programs and Finite Sequence Spaces," Proc. Nat. Acad. Sci. U.S. 48, 783-786 (1962).
[8] Charnes, A., W. W. Cooper, and K. O. Kortanek, "On the Theory of Semi-Infinite Programming and a Generalization of the Kuhn-Tucker Saddle Point Theorem for Arbitrary Convex Functions," NRLQ 16, 41-51 (1969).
[9] Cheney, W. E., Introduction to Approximation Theory (McGraw-Hill, Inc., N.Y., 1966).
[10] Dantzig, G. B., Linear Programming and Extensions (Princeton University Press, N.J., 1963).
[11] DeVore, R., "One Sided Approximations of Functions," J. Approx. Theory 1, 11-25 (1968).
[12] Duffin, R. J., "Infinite Programs," in Linear Inequalities and Related Systems (ed. by H. W. Kuhn and A. W. Tucker), Annals of Math. Studies No. 38, Princeton University Press, Princeton, N.J., pp. 157-170 (1956).
[13] Duffin, R. J., "An Orthogonality Theorem of Dines Related to Moment Problems and Linear Programming," J. Combinatorial Theory 2, 1-26 (1967).
[14] Duffin, R. J., "Duality Inequalities of Mathematics and Science," pp. 401-423 in [39].
[15] Duffin, R. J. and L. A. Karlovitz, "Formulation of Linear Programs in Analysis I: Approximation Theory," SIAM Jour. 16, 662-675 (1968).
[16] Fan, Ky, "On Systems of Linear Inequalities," in Linear Inequalities and Related Systems (ed. by H. W. Kuhn and A. W. Tucker), Annals of Math. Studies No. 38, Princeton University Press, Princeton, N.J. (1956), pp. 99-156.
[17] Fan, Ky, "Asymptotic Cones and Duality of Linear Relations," J. Approx. Theory 2, 152-159 (1969).
[18] Gorr, W. and K. O. Kortanek, "Numerical Aspects of Pollution Abatement Problems: Constrained Generalized Moment Techniques," IPP Report No. 12, School of Urban and Public Affairs, Carnegie-Mellon University (Oct. 1970).
[19] Gorr, W., S.-A. Gustafson, and K. O. Kortanek, "Optimal Control Strategies for Air Quality Standards and Regulatory Policy," Environment and Planning 4, 183-192 (1972).
[20] Gustafson, S.-A., "On the Computational Solution of a Class of Generalized Moment Problems," SIAM J. Numer. Analysis 7, 343-357 (1970).
[21] Gustafson, S.-A., "Numerical Aspects of the Moment Problem," Fil.dr. Thesis, Institutionen for Informationsbehandling, Stockholms Universitet, Stockholm, Sweden (Apr. 1970).
[22] Gustafson, S.-A., "Control and Estimation of Computational Errors in the Evaluation of Interpolation Formulae and Quadrature Rules," Math. Computation 24, 847-854 (1970).
[23] Gustafson, S.-A. and Germund Dahlquist, "On the Computation of Slowly Convergent Fourier Integrals," Methoden und Verfahren der Mathematischen Physik 6, 37-43 (1972).
[24] Gustafson, S.-A. and K. O. Kortanek, "Analytical Properties of Some Multiple-Source Urban Diffusion Models," Environment and Planning 4, 31-41 (1972).
[25] Gustafson, S.-A. and K. O. Kortanek, "Mathematical Models for Air Pollution Control: Numerical Determination of Optimizing Abatement Policies," to appear in Models for Environmental Pollution Control (R. A. Deininger, Ed.), Ann Arbor Science Press, Ann Arbor, Mich.
[26] Gustafson, S.-A., K. O. Kortanek, and W. Rom, "Non-Cebysevian Moment Problems," SIAM J. Numer. Analysis 7, 335-342 (1970).
[27] Gustafson, S.-A. and J. Martna, "Numerical Treatment of Size Frequency Distributions with Computer Machine," Geologiska Foreningens Forhandlingar 84, 372-389 (1962).
[28] Gustafson, S.-A. and W. Rom, "Applications of Semi-Infinite Programming to the Computational Solution of Approximation Problems," Tech. Report No. 88, Dept. of Operations Research, Cornell University, Ithaca, N.Y. (Sept. 1969).
[29] Haar, A., "Über lineare Ungleichungen," Acta Math. (Szeged) 2, 1-14 (1924).
[30] John, Fritz, "Extremum Problems with Inequalities as Side Conditions," in Studies and Essays, Courant Anniversary Vol. (ed. K. O. Friedrichs, O. E. Neugebauer, and J. J. Stoker), J. Wiley and Sons, Inc., New York, pp. 187-204 (1948).
[31] Kantorovich, L. V. and G. Sh. Rubinshtein, "Concerning a Functional Space and Some Extremum Problems," Dokl. Akad. Nauk SSSR 115, 1058-1061 (1957).
[32] Karlin, S. and W. J. Studden, Tchebycheff Systems: With Applications in Analysis and Statistics (Interscience Publishers, J. Wiley and Sons, Inc., New York, 1966).
[33] Kelley, J. E., Jr., "The Cutting Plane Method for Solving Convex Programs," J. SIAM 8, 703-712 (1960).
[34] Kortanek, K. O. and J. P. Evans, "Pseudo-Concave Programming and Lagrange Regularity," Operations Research 15, 882-891 (1967).
[35] Krafft, Olaf, "Programming Methods in Statistics and Probability Theory," pp. 425-446 in [39].
[36] Kretschmer, K. S., "Programmes in Paired Spaces," Can. J. Math. 13, 221-238 (1961).
[37] Kretschmer, K. S., "Linear Programming in Locally Convex Spaces and Its Use in Analysis," Ph.D. Thesis, Carnegie-Mellon University, Pittsburgh, Pa. (1958).
[38] Meinardus, Günter, Approximation of Functions: Theory and Numerical Methods (Springer-Verlag, New York, Inc., 1967).
[39] Nonlinear Programming (ed. J. B. Rosen, O. L. Mangasarian, and K. Ritter) (Academic Press, New York, 1970).
[40] Ortega, J. M. and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables (Academic Press, New York and London, 1970).
[41] Powell, M. J. D., "On the Maximum Errors of Polynomial Approximation Defined by Interpolation and by Least Square Criteria," Comput. J. 9, 404-407 (1966).
[42] Rivlin, T. J. and H. S. Shapiro, "A Unified Approach to Certain Problems of Approximation and Minimization," SIAM J. Appl. Math. 9, 670-699 (1961).
[43] Rubinshtein, G. Sh., "Investigations on Dual Extremal Problems," Doctoral Dissertation, Inst. Matem. SO AN SSSR, Novosibirsk (1965).
[44] Shapiro, H. S., "On a Class of Extremal Problems for Polynomials in the Unit Circle," Portugaliae Math. 20, 67-93 (1961).
[45] Shohat, J. A. and J. D. Tamarkin, "The Problem of Moments," Mathematical Surveys, No. 1, Am. Math. Soc., New York (1943).
[46] Stiefel, E., "Note on Jordan Elimination, Linear Programming and Tchebycheff Approximation," Numer. Math. 2, 1-17 (1960).
[47] Todd, J., A Survey of Numerical Analysis (McGraw-Hill, New York, 1962).
[48] Vershik, A. M. and V. Temel't, "Some Questions Concerning the Approximation of the Optimal Value of Infinite-Dimensional Problems in Linear Programming," Sibirskii Matematicheskii Zhurnal 9, 591-601 (1968).
[49] Wilkinson, J. H., Rounding Errors in Algebraic Processes (Prentice-Hall, Inc., Englewood Cliffs, N.J., 1963).
[50] Wilkinson, J. H., The Algebraic Eigenvalue Problem (Clarendon Press, Oxford, 1965).
[51] Wolfe, Philip, "Accelerating the Cutting Plane Method for Nonlinear Programming," J. Soc. Indust. Appl. Math. 9, 481-488 (1961).
[52] Bartels, R. H., G. H. Golub, and M. A. Saunders, "Numerical Techniques in Mathematical Programming," in Nonlinear Programming (ed. by J. B. Rosen, O. L. Mangasarian and K. Ritter), Academic Press, New York, pp. 123-176 (1970).
MIN/MAX BOUNDS FOR DYNAMIC NETWORK FLOWS
W. L. Wilkinson
The George Washington University
ABSTRACT
This paper presents an algorithm for determining the upper and lower bounds for arc flows in a maximal dynamic flow solution. The procedure is basically an extended application of the Ford-Fulkerson dynamic flow algorithm, which also solves the minimal cost flow problem. A simple example is included. The presence of bounded optimal arc flows entertains the notion that one can pick a particular solution which is preferable by secondary criteria.
I. INTRODUCTION
Ford and Fulkerson [1] introduced the notion of maximal dynamic flows in networks and provided an ingenious algorithm for solving the dynamic linear programming problem. A dynamic network consists of arcs and nodes with two nonnegative integers associated with each arc. One of the integers defines the capacity of the arc and the other the time required to traverse the arc. There are two distinguished nodes in the network, one for the source where all flows originate and one for the sink where all flows terminate. If at each node the commodity can either be transshipped immediately or held over for later shipment, what is the maximal amount of commodity flow from source to sink in a specified number of time periods? Solutions constructed by the Ford-Fulkerson algorithm have the attractive property of being presented as a relatively small number of activities (chain flows which represent a shipping schedule) which are repeated over and over (temporal repetition) until the end of the allotted time span. A consequence of this temporal repetition is that a single arc flow value in each arc represents an optimal solution independent of how these arc flows are decomposed into chain flows. In networks of operational interest, these optimal arc flow values frequently have an upper bound different from the lower bound. These bounds say that you can always find an optimal chain flow solution which lies on or between the stated bounds, and no optimal solution lies outside these bounds. The procedure set forth in the sequel calculates these boundary values for each arc.
As shown in [2], the Ford-Fulkerson dynamic flow algorithm also solves the minimal cost flow problem. In this problem, roughly described, we are given a network having one or more sources and one or more sinks, with availabilities at the sources and requirements at the sinks. There are intermediate nodes between the sources and sinks, with connecting arcs having assigned capacities and unit shipping costs. The problem is to construct a feasible flow, if one exists, which minimizes cost in satisfying the requirements within the given availabilities. Similarly to the dynamic flow problem, the bounds on optimal arc flows will indicate the variety of ways, if any, in which such a feasible flow can be constructed.
Before describing the computing procedure for bounded flows, we will give a more formal statement
of the dynamic flow problem referred to above.
II. MAXIMAL DYNAMIC FLOWS
Given the network G = [N; A] with source s and sink t in the node set N, we let nonnegative integers c(x, y) and a(x, y) be the capacity and traversal time, respectively, of each arc (x, y) in the arc set A. Let f(x, y; τ) be the amount of flow that leaves x along (x, y) at time τ, consequently arriving at y at τ + a(x, y). Also, f(x, x; τ) is the holdover at x from τ to τ + 1. If V(P) is the net flow leaving s or entering t during the P periods 0 to 1, 1 to 2, ..., P - 1 to P, then the problem may be stated as the linear program:

Maximize V(P),

subject to

Σ_{τ=0}^P Σ_y [f(s, y; τ) - f(y, s; τ - a(y, s))] - V(P) = 0,

Σ_y [f(x, y; τ) - f(y, x; τ - a(y, x))] = 0, x ≠ s, t; τ = 0, 1, ..., P,

0 ≤ f(x, y; τ) ≤ c(x, y).

Here a(x, x) = 1 and c(x, x) = ∞ for holdovers at node x. If f(x, y; τ) and V(P) satisfy the above constraints, we call f a dynamic flow from s to t (for P periods) and say that f has value V(P). If also V(P) is maximal, then f is a maximal dynamic flow.
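The value V(P) of the program above can be checked on small instances by maximizing an ordinary static flow on the time-expanded network, a standard equivalence. This is not the temporally repeated Ford-Fulkerson algorithm of the paper, which is far more economical; the sketch below is only a brute-force reference implementation, with names of our own choosing:

```python
from collections import deque

def max_dynamic_flow(arcs, s, t, P):
    """Value of the maximal dynamic flow, computed on the time-expanded
    network: a node (x, tau) for each tau = 0, ..., P, an arc
    (x, tau) -> (y, tau + a) of capacity c for every arc (x, y, c, a),
    and infinite-capacity holdover arcs (x, tau) -> (x, tau + 1)."""
    INF = float("inf")
    cap = {}

    def add(u, v, c):
        cap[(u, v)] = cap.get((u, v), 0) + c
        cap.setdefault((v, u), 0)          # residual (reverse) arc

    nodes = {x for arc in arcs for x in arc[:2]}
    for x in nodes:
        for tau in range(P):
            add((x, tau), (x, tau + 1), INF)
    for x, y, c, a in arcs:
        for tau in range(P - a + 1):
            add((x, tau), (y, tau + a), c)

    src, snk, total = (s, 0), (t, P), 0
    while True:                            # augmenting-path (BFS) loop
        prev = {src: None}
        queue = deque([src])
        while queue and snk not in prev:
            u = queue.popleft()
            for (tail, head), residual in cap.items():
                if tail == u and head not in prev and residual > 0:
                    prev[head] = u
                    queue.append(head)
        if snk not in prev:
            return total
        path, v = [], snk
        while prev[v] is not None:
            path.append((prev[v], v))
            v = prev[v]
        aug = min(cap[e] for e in path)
        for u_, v_ in path:
            cap[(u_, v_)] -= aug
            cap[(v_, u_)] += aug
        total += aug
```

For a single arc s → t of capacity 2 and traversal time 1 with P = 3, shipments can leave at times 0, 1 and 2, so V(3) = 6, agreeing with the temporally repeated solution.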
III. BOUNDED FLOW ALGORITHM
To initiate the bounded arc flow computations, a maximal dynamic flow solution is required. The Ford-Fulkerson algorithm is used to obtain such a solution. We set forth their algorithm here, using the notation of [2], as Routine I, in the interests of a coherent presentation for the convenience of the reader.
Routine II is an application of Kirchhoff's first general law on the conservation of flow at any node in a network. Routine II calculates "slack bounds": slack in the sense that flow values contained by these bounds may not all be optimal; however, no optimal arc flow values are excluded by the bounds. Routine II is very efficient in calculating bounds based on local information at a node and is retained for that reason.
Routine III tightens these bounds to their true values where necessary. Routine III is, essentially, an application of Routine I (the Ford-Fulkerson algorithm) to a subnetwork composed of a special set of admissible arcs from the original network. Using the original flow solution and the flow boundaries from Routine II, optimal flows are circulated through the admissible arcs about a selected arc being scanned. Treating the initial and terminal nodes of the arc being scanned as a temporary source and sink, optimal flows are maximized and minimized in the arc, thereby determining the absolute upper and lower optimal arc flow boundaries. If at any time the varying optimal flow values in the admissible arcs are observed to reach a bound of Routine II, that bound has been verified, since a known solution lies on a bound which is suspected of being too loose. Consequently, several unscanned arcs may get scanned in the process of scanning a particular arc. The reader may note that Routine III, slightly revised, could compute the true bounds without the aid of Routine II. Experience has shown that retaining the services of Routine II saves a substantial amount of computation time.
ROUTINE I
Ford-Fulkerson Algorithm
Initial Conditions
1. Establish P, the time span of interest.
2. Set node numbers π(x) = 0 for all x.
3. Define ā(x, y) = π(x) + a(x, y) - π(y) and consider as an admissible arc any (x, y) where ā(x, y) = 0.
4. Set all f(x, y) = 0.
5. During the routine a node is in one of the following states:
Unlabeled and unscanned,
Labeled but unscanned, or
Labeled and scanned.
6. All nodes are unlabeled.
Arc Flow Generating Routine
STEP 1. To node s assign the label [+t, ∞] and consider node s as unscanned.
STEP 2. Take any labeled, unscanned node x and suppose that it is labeled [±w, h]. To all nodes
y that are unlabeled and such that:
a. (x, y) is admissible and f(x, y) < c(x, y), assign the label [+x, min (h, c(x, y) − f(x, y))], or if
b. (y, x) is admissible and f(y, x) > 0, assign the label [−x, min (h, f(y, x))].
Consider node x as scanned and any newly labeled y-nodes as unscanned. Repeat until:
a. node t is labeled (breakthrough), or
b. no new labels are possible and node t is unlabeled (nonbreakthrough).
STEP 3. If breakthrough results, suppose node t is labeled [+y, h]; replace f(y, t) by
f(y, t) + h. Next turn attention to node y. In general, if y is labeled [+x, m], replace f(x, y)
by f(x, y) + h, or if y is labeled [−x, m], replace f(y, x) by f(y, x) − h. Next turn attention to node x.
Ultimately the node s is reached; at this point stop the replacement process. Starting with the new
flows thus generated, discard the old labels and repeat the above Arc Flow Generating Routine until no
new labels are possible and node t cannot be labeled. When this condition results, proceed with the
following Non-Breakthrough Processing Routine.
Non-Breakthrough Processing
STEP 1. Calculate a value of δ as follows: Define
A₁ = {(x, y) | x ∈ X, y ∈ X̄, d(x, y) > 0},
508 W. L. WILKINSON
A₂ = {(x, y) | x ∈ X̄, y ∈ X, d(x, y) < 0},
where X is the subset of labeled nodes and X̄ is the complementary subset of unlabeled nodes. Let
δ₁ = min_{(x, y) ∈ A₁} [d(x, y)], or P + 1 − π(t) if A₁ = ∅,
δ₂ = min_{(x, y) ∈ A₂} [−d(x, y)], or P + 1 − π(t) if A₂ = ∅.
Then
δ = min (δ₁, δ₂).
Now define for all x new node numbers π′(x) as
π′(x) = π(x), if x is labeled,
π′(x) = min [π(x) + δ; π(x) + P + 1 − π(t)], if x is unlabeled.
After new node numbers have been assigned, consider π′(x) as π(x).
STEP 2. If π(t) < P + 1, return to the Arc Flow Generating Routine. If π(t) = P + 1, the algorithm
terminates.
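For readers who want to experiment, the label/scan/replace cycle of the Arc Flow Generating Routine can be sketched in a few lines. The fragment below is a minimal illustration on an ordinary static network: it keeps the [±x, h] labels and the STEP 2/STEP 3 logic, but it omits the node numbers, the admissibility test, and the non-breakthrough processing, and all identifiers are ours rather than the paper's.

```python
from collections import defaultdict

def max_flow(arcs, s, t):
    """Labeling method of Routine I on a static network (sketch).

    arcs: dict mapping (x, y) -> capacity c(x, y).
    Returns the maximal flow value and the arc flow dict f.
    """
    f = defaultdict(int)
    nodes = {v for arc in arcs for v in arc}
    while True:
        # STEP 1/2: label s with [+, inf], then label and scan until t is
        # labeled (breakthrough) or no new labels are possible.
        label = {s: (None, '+', float('inf'))}
        unscanned = [s]
        while unscanned and t not in label:
            x = unscanned.pop()
            h = label[x][2]
            for y in nodes - set(label):
                if arcs.get((x, y), 0) > f[(x, y)]:
                    label[y] = (x, '+', min(h, arcs[(x, y)] - f[(x, y)]))
                    unscanned.append(y)
                elif (y, x) in arcs and f[(y, x)] > 0:
                    label[y] = (x, '-', min(h, f[(y, x)]))
                    unscanned.append(y)
        if t not in label:                      # non-breakthrough: flow is maximal
            return sum(f[(x, t)] for x in nodes if (x, t) in arcs), dict(f)
        # STEP 3: trace labels back from t to s, augmenting flows by h.
        h, y = label[t][2], t
        while y != s:
            x, sign, _ = label[y]
            if sign == '+':
                f[(x, y)] += h
            else:
                f[(y, x)] -= h
            y = x
```

On a small network the routine reproduces the familiar max-flow value; the integral capacities guarantee termination, exactly as in [2].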
ROUTINE II
Slack Arc Flow Bounds
Initial Conditions
1. A maximal dynamic flow solution has been obtained for a particular P. Retain d(x, y) and
f(x, y) for all (x, y). Retain π(x) for all x.
2. Set for all (x, y):
G(x, y)/g(x, y) = 0/0, if d(x, y) > 0,
G(x, y)/g(x, y) = c(x, y)/0, if d(x, y) = 0,
G(x, y)/g(x, y) = c(x, y)/c(x, y), if d(x, y) < 0.
3. Add an arc (t, s) and set G(t, s) = g(t, s) = Σ_y f(s, y).
4. Order all nodes in increasing π(x) sequence with no preference where equality exists.
5. All nodes are unscanned.
6. If d(x, y) = 0, then (x, y) is an admissible arc.
Procedure
STEP 1. Take the lowest ordered, unscanned node x and for all admissible arcs (x, y) calculate and
insert into the arc record
(1a) G′(x, y) = min [G(x, y); Σᵢ G(i, x) − Σⱼ g(x, j) + g(x, y)],
(1b) g′(x, y) = max [g(x, y); Σᵢ g(i, x) − Σⱼ G(x, j) + G(x, y)].
If G′(x, y) < G(x, y) or g′(x, y) > g(x, y), consider y as unscanned.* Now consider the newly assigned
G′(x, y) and g′(x, y) as G(x, y) and g(x, y), respectively. When all admissible arcs have been examined,
consider x as scanned, proceed to the next lowest ordered unscanned node and scan that node. Scan
all nodes.
STEP 2. When all nodes have been scanned, then consider all nodes as unscanned. Take the
highest ordered, unscanned node y and for all admissible arcs (x, y) calculate and insert in the arc record
(2a) G′(x, y) = min [G(x, y); Σⱼ G(y, j) − Σᵢ g(i, y) + g(x, y)],
(2b) g′(x, y) = max [g(x, y); Σⱼ g(y, j) − Σᵢ G(i, y) + G(x, y)].
If G′(x, y) < G(x, y) or g′(x, y) > g(x, y), consider x as unscanned. Now consider the newly assigned
G′(x, y) and g′(x, y) as G(x, y) and g(x, y), respectively. When all admissible arcs have been examined,
consider y as scanned, proceed to the next highest ordered unscanned node and scan that node.
Scan all nodes.
STEP 3. When all nodes have been scanned, then consider all nodes as unscanned. Take the lowest
ordered, unscanned node x and for all admissible arcs calculate and insert in the arc record the results
of Equations (1). Consider x as scanned. If G′(x, y) < G(x, y) or g′(x, y) > g(x, y), consider y as
unscanned. Proceed to the next lowest ordered, unscanned node and scan that node. Scan all nodes. This
terminates Routine II. Go to Routine III.
NOTE: As described above, Routine II sweeps from source to sink, sink to source and then source
to sink. Computational experience has indicated that three sweeps achieve the best compromise
between best bounds and reasonable computing times. One could specify repetitive sweeps until a
complete sweep had been made with no changes to G(x, y)/g(x, y). Alternatively, premature termination
is allowed at any point since we are only seeking approximations to the true G(x, y)/g(x, y).
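A single source-to-sink sweep of Equations (1) can be sketched directly from the definitions. The fragment below is our own illustration, not the author's PL/1 code: it assumes the admissible arcs, the return arc (t, s), and the node ordering have already been set up as in the Initial Conditions, and it applies (1a)/(1b) in place, treating each newly assigned bound immediately as the current G or g.

```python
def slack_sweep(order, G, g):
    """One source-to-sink sweep of Routine II, Equations (1a)/(1b).

    order: nodes in increasing node-number sequence.
    G, g : dicts (x, y) -> current upper/lower bound on admissible arcs
           (including the return arc (t, s) with G = g = maximal static flow).
    Bounds are tightened in place; per Lemma 1, G never increases and
    g never decreases under these updates.
    """
    for x in order:
        into = [a for a in G if a[1] == x]   # admissible arcs (i, x)
        out = [a for a in G if a[0] == x]    # admissible arcs (x, j)
        for arc in out:
            # (1a): available inflow less flow required on the other out-arcs.
            G[arc] = min(G[arc],
                         sum(G[a] for a in into)
                         - sum(g[a] for a in out) + g[arc])
            # (1b): required inflow less flow available on the other out-arcs.
            g[arc] = max(g[arc],
                         sum(g[a] for a in into)
                         - sum(G[a] for a in out) + G[arc])
```

On a single chain s → a → t with c(s, a) = 5, c(a, t) = 3 and static max flow 3, one sweep pins every arc to G = g = 3, since the return-arc bound propagates node by node.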
ROUTINE III
Taut Arc Flow Bounds
Initial Conditions
1. Label all arcs as follows:
a. "Gg" if f(x, y) = G(x, y) = g(x, y),
b. "G+" if f(x, y) = G(x, y) ≠ g(x, y),
c. "+g" if f(x, y) = g(x, y) ≠ G(x, y), or
d. "++" if none of the above is true.
*The scan state of y may be either scanned or unscanned. If neither condition is met, then this state remains unchanged.
This potential redundancy is necessary to accommodate the d(x, y) = 0 instances where π(x) = π(y); thus both arcs, (x, y) and
(y, x), may be admissible. The stated inequalities prevent looping.
510 W. L. WILKINSON
2. Mark all arcs labeled "Gg" and consider them scanned.
3. An unmarked arc is an admissible arc.
4. All nodes are unlabeled and unscanned.
5. If all arcs are scanned, terminate the routine. Otherwise, go to the procedure below.
Procedure
STEP 1. Take any unscanned arc (x, y) and consider x as s′ and y as t′.
a. If (s′, t′) is labeled "+g" go to STEP (2) below and omit STEP (3) below.
b. If (s′, t′) is labeled "G+" go to STEP (3) below.
c. If (s′, t′) is labeled "++" go to STEPS (2) and (3) below.
STEP 2. To node t′ assign the label [+s′, G(s′, t′) − f(s′, t′)]. Take a labeled unscanned node x
(initially t′ is the only such node) and suppose it is labeled [±w, h]. To all nodes y that are unlabeled and
a. (x, y) is admissible and f(x, y) < G(x, y), assign the label [+x, min (h, G(x, y) − f(x, y))],
or if
b. (y, x) is admissible and f(y, x) > g(y, x), assign the label [−x, min (h, f(y, x) − g(y, x))].
Consider node x as scanned. Repeat until node s′ is labeled (breakthrough) or no new labels are possible
and node s′ is unlabeled (nonbreakthrough). If breakthrough results and node s′ is labeled [+y, h],
replace f(y, s′) by f(y, s′) + h, or if node s′ is labeled [−y, h], replace f(s′, y) by f(s′, y) − h. In either
case, if the arc (y, s′) or (s′, y) was previously considered scanned, it remains scanned bearing the
label "Gg." If not previously scanned, consider the following cases.
a. Node s′ is labeled [+y, h]. If the new f(y, s′) = G(y, s′) and the current label is "+g,"
relabel the arc "Gg" and consider it scanned; or if the current label is "++," relabel the arc "G+"
and consider it unscanned. If the new f(y, s′) < G(y, s′), the arc retains its current label and remains
unscanned.
b. Node s′ is labeled [−y, h]. If the new f(s′, y) = g(s′, y) and the current label is "G+,"
relabel the arc "Gg" and consider it scanned; or if the current label is "++," relabel the arc "+g" and
consider it unscanned. If the new f(s′, y) > g(s′, y), the arc retains its current label and remains
unscanned.
Next, turn attention to node y and repeat the replacement and labeling process as for (y, s′) or (s′, y),
incrementing or decrementing the flow value by h and determining whether the arc is to be considered
scanned or unscanned. Continue this replacement process until a reverse path to node s′ has been
traced out. At this point, stop the replacement process. If f(s′, t′) = G(s′, t′), a condition for
nonbreakthrough exists. If not, then starting with the new flows thus generated, discard the old node labels,
consider all nodes as unscanned and repeat this Procedure until nonbreakthrough results. When
nonbreakthrough results, record G(s′, t′) = f(s′, t′). Erase all node labels, consider all nodes as unscanned
and proceed to STEP 3 below if 1c is satisfied. Otherwise, go to STEP 4.
STEP 3. Do STEP 2 except assign the initial label to node s′ as [−t′, f(s′, t′) − g(s′, t′)], stop
the labeling and replacement breakthrough process at t′ and, on nonbreakthrough, record g(s′, t′)
= f(s′, t′). Then go to STEP 4 below.
STEP 4. Consider (s′, t′) as scanned and reverted to its original (x, y) designation. Erase all node
labels and consider all nodes as unscanned. Take the next unscanned arc (x, y) and repeat all of the
above Procedure. If no such arc exists, terminate the routine. G(x, y) and g(x, y) are the firm upper and
lower bounds, respectively, for optimal arc flows.
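As a check on what Routine III produces, the taut bounds can be computed by brute force on a small example: enumerate every integral circulation that satisfies conservation and the slack bounds, and record the extreme flow carried by each arc. Routine III reaches the same bounds far more efficiently through FF-style augmentation; the exhaustive version below (our own illustration, not the paper's procedure) is exponential and only suitable for toy networks.

```python
from itertools import product

def taut_bounds(arcs, nodes):
    """Brute-force taut bounds.  arcs maps (x, y) -> (g, G) slack bounds.

    Enumerates all integral arc flows within the slack bounds, keeps those
    conserving flow at every node, and records each arc's min and max.
    """
    keys = list(arcs)
    lo = {a: None for a in keys}
    hi = {a: None for a in keys}
    for combo in product(*(range(arcs[a][0], arcs[a][1] + 1) for a in keys)):
        f = dict(zip(keys, combo))
        conserved = all(
            sum(f[a] for a in keys if a[1] == v) ==
            sum(f[a] for a in keys if a[0] == v)
            for v in nodes)
        if conserved:
            for a in keys:
                lo[a] = f[a] if lo[a] is None else min(lo[a], f[a])
                hi[a] = f[a] if hi[a] is None else max(hi[a], f[a])
    return lo, hi
```

For two parallel paths s → a → t and s → b → t with a return arc (t, s) fixed at 2 units, each path arc can optimally carry anywhere from 0 to 2 units, which is exactly the taut bound pair the routine would record.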
IV. PROOF OF EXTREME ARC FLOW VALUES
We stated earlier that the Bounded Flow Routine II has proven useful in reducing computing
times. Since Routine III operates on the slack bounds produced by Routine II, it is necessary to show
that optimal arc flow values are never excluded by Routine II. We will now prepare the way for stating
such a theorem.
LEMMA 1: The sequence of equation pairs for G′(x, y) and g′(x, y) in the Bounded Flow Routine
II will produce monotonic nonincreasing values of G′(x, y) and monotonic nondecreasing values of
g′(x, y) for all (x, y).
PROOF: The truth of the assertion is a natural consequence of the structure of Equations (1) and
(2) for G′(x, y) and g′(x, y), where the upper limit for G′(x, y) is G(x, y) and the lower bound for g′(x, y)
is g(x, y). Independent of the number of iterations of the equation pairs, the previously calculated value
for G(x, y) or g(x, y) provides a bound consistent with the assertion above.
LEMMA 2: If G(x, y) ≥ g(x, y) for all (x, y), Equations (1) and (2) will maintain G′(x, y) ≥ g′(x, y)
for all (x, y).
PROOF: Consider the pair of Equations (1) for G′(x, y) and g′(x, y). The cases of interest are
where:
(1) G′(x, y) = Σᵢ G(i, x) − Σⱼ g(x, j) + g(x, y) < G(x, y),
and
(2) g′(x, y) = Σᵢ g(i, x) − Σⱼ G(x, j) + G(x, y) > g(x, y).
Ignoring the inequalities and subtracting the second equation from the first (the j = y terms of the
sums cancel against the g(x, y) and G(x, y) terms), we have
G′(x, y) − g′(x, y) = Σᵢ [G(i, x) − g(i, x)] + Σ_{j≠y} [G(x, j) − g(x, j)].
By hypothesis, each of the two summation terms on the right side is nonnegative; therefore, the
difference on the left side is nonnegative.
Equations (2) are symmetric with (1) for calculations progressing from sink to source. A similar
exercise to the above would produce equivalent results.
That the hypothesis is true at the outset can be seen in the Initial Conditions established by
Condition No. 2. Recall that we insisted that c(x, y) ≥ 0.
LEMMA 3: When scanning a node x, the equations of Routine II always calculate Σⱼ G(x, j) ≥
Σᵢ g(i, x).
PROOF: Consider Equation (1b), which is
g′(x, y) = max [g(x, y); Σᵢ g(i, x) − Σⱼ G(x, j) + G(x, y)],
or
g′(x, y) − G(x, y) ≥ Σᵢ g(i, x) − Σⱼ G(x, j).
We restate the above inequality by changing signs. Now,
G(x, y) − g′(x, y) ≤ Σⱼ G(x, j) − Σᵢ g(i, x).
Lemma 2 states that G′(x, y) ≥ g′(x, y) and Lemma 1 assures us that G(x, y) ≥ G′(x, y). Therefore,
the left side is nonnegative. Consequently, Σⱼ G(x, j) ≥ Σᵢ g(i, x).
Analogous arguments to those above for Lemma 3 would prove its corollaries for the following:
Σᵢ G(i, x) ≥ Σⱼ g(x, j), by considering Equation (1a),
Σⱼ G(y, j) ≥ Σᵢ g(i, y), by considering Equation (2a), and
Σᵢ G(i, y) ≥ Σⱼ g(y, j), by considering Equation (2b).
THEOREM 1: Let G(x, y) and g(x, y) be the upper and lower bounds, respectively, as determined
by the Bounded Flow Routine II for all arcs (x, y) in G = [N; A]. Then any temporally repeated maximal
dynamic flow solution for this network will contain for all (x, y) a flow f(x, y) such that
g(x, y) ≤ f(x, y) ≤ G(x, y).
PROOF: Required flows are initially set in the inadmissible arcs according to the optimality
criteria, i.e., for those arcs where d(x, y) < 0, the minimum flow is equal to the arc capacity, and for
those arcs where d(x, y) > 0, the maximum flow is equal to zero. Where d(x, y) = 0, the flow may be
anywhere between zero and the arc capacity. In the return arc from sink to source, we have set the
minimum and maximum equal to the maximal static flow for the system for the time period of interest.
These values do not change at any time. Initially, for the admissible arcs, the maximum flow is set at the
arc capacity and the minimum flow at zero, which are the broadest possible bounds since we insist that
0 ≤ f(x, y) ≤ c(x, y) for all (x, y). Clearly then, our theorem holds for the initial conditions.
Lemma 1 tells us that G(x, y) never increases, but tends to decrease, and g(x, y) never decreases,
but tends to increase, while Lemma 2 maintains that G(x, y) ≥ g(x, y) for all (x, y) at all times. Lemma 3
and its corollaries insure that the available flow into a node is at least equal to the required flow out of a
node and, conversely, the required flow into a node is no greater than the available flow out of a node for
all nodes at all times. Since initially the feasibility of meeting local conditions of optimality exists
everywhere, we will proceed by induction on the sequence of nodes to be scanned, i.e.,
s, x₁, x₂, . . ., t, where π(xᵢ) ≤ π(xᵢ₊₁).
Where equality holds between two or more nodes and the connecting arcs have zero traversal times,
there may be redundant scanning of any of the nodes, but Lemmas 1 through 3 will hold nonetheless
for each scan.
For the initial scanning of s, we know that G(s, x) and g(s, x) are all valid in the sense of the
theorem. Therefore the cases of interest are where G′(s, x) < G(s, x) and g′(s, x) > g(s, x). Accordingly,
G′(s, x) = G(t, s) − Σ_{j≠x} g(s, j),
or
G′(s, x) + Σ_{j≠x} g(s, j) = G(t, s).
Suppose we substitute
f(s, x) > G′(s, x) for G′(s, x);
then
f(s, x) + Σ_{j≠x} g(s, j) > G(t, s),
a contradiction to conservation of flow at s and the relative flow conditions maintained by Lemma 3.
Similarly,
g′(s, x) + Σ_{j≠x} G(s, j) = g(t, s),
and we substitute f(s, x) < g′(s, x) for g′(s, x). Then
f(s, x) + Σ_{j≠x} G(s, j) < g(t, s),
again a contradiction as above. Thus we see that G′(s, x) and g′(s, x) are valid upper and lower bounds,
respectively, for all x, whether or not they have been changed from their initial values.
Next consider x₁. If Σᵢ G(i, x₁) or Σᵢ g(i, x₁) are no longer their original values, we know from the
above that they are valid. Again, the equation of interest is
G′(x₁, y) + Σ_{j≠y} g(x₁, j) = Σᵢ G(i, x₁),
and a substitution f(x₁, y) > G′(x₁, y) constitutes a contradiction. Similarly,
g′(x₁, y) + Σ_{j≠y} G(x₁, j) = Σᵢ g(i, x₁).
A substitution f(x₁, y) < g′(x₁, y) produces a contradiction and G′(x₁, y) and g′(x₁, y) are
validated.
As we proceed, we see that this holds for any xⱼ, for the calculations are based on Σᵢ G(i, xⱼ) and
Σᵢ g(i, xⱼ) which, although possibly changed from their initial values, are known to be valid in the
sense of the theorem.
At the pivot node t, we reverse our sequence and proceed to s. Lemmas 2 and 3 insure that
ultraconservative conditions hold at this crucial point, and a parallel argument to the s to t sequence would
produce equivalent results. One may make as many iterative passes from s to t and t to s as desired
without violating the conditions asserted by the theorem. This concludes our proof.
We now turn our attention to the Bounded Flow Routine III. The scanning process of Routine III
is basically an application of the Ford-Fulkerson (FF) algorithm (Routine I) as it operates between
nonbreakthroughs. The maintenance of conservation of flow at every node and the maximizing properties
of this algorithm are well established [2]. Our proof for Routine III then reduces to showing that
there exists a formal equivalence between our application and the standard conditions for the FF
algorithm.
Consider STEP 2, which is concerned with determining the best G(x, y). G(x, y) has replaced
c(x, y) as the upper bound and g(x, y) has replaced zero as the lower bound. Justification for these
substitutions can be found in Theorem 1. Our new network has all the arcs and nodes of the old except
the arc being scanned. As in the FF algorithm, only those arcs where d(x, y) = 0 are admissible for
labeling purposes because, under the optimality criteria, it is only in these arcs that the flow can be
altered. We start with a feasible flow, as does the FF algorithm. The source for this network is t′ and the
sink is s′. The source gets labeled with the maximum amount of flow that can be augmented in (s′, t′),
i.e., G(x, y) − f(x, y) by Theorem 1. In the FF algorithm, this is taken to be ∞ for it is not known
a priori what the maximum amount of flow augmentation is. The labeling rules are the same, as are the
rules for breakthrough and nonbreakthrough. There is a distinction in the replacement process, where
the flow is incremented or decremented in a sequence of arcs which runs from source to sink. However,
the last arc in this sequence is the arc being scanned, which is not a part of the current network. When
the routine reaches nonbreakthrough, further flow augmentation is impossible and we have the maximum
flow in the arc being scanned.
The argument for STEP 3 follows that for STEP 2 above. The source for this network is s′ and the
sink is t′. The source gets a negative label with the maximum amount the flow in (s′, t′) can be reduced,
i.e., f(s′, t′) − g(s′, t′) by Theorem 1. On nonbreakthrough, the flow in (s′, t′) will have been
decremented to its absolute minimum with respect to optimality and we record the final g(x, y) = f(x, y).
We can now state the following theorem.
THEOREM 2: Let G(x, y) and g(x, y) be the upper and lower bounds, respectively, as determined
by the Bounded Flow Routine III for all arcs (x, y) in G = [N; A]. Then the integers n such that
g(x, y) ≤ n ≤ G(x, y)
provide an exhaustive set of valid arc flows for which there exists an integer, temporally repeated,
maximal dynamic flow solution for G = [N; A].
V. AN EXAMPLE
In Figure 1 is shown a simple network and a dynamic flow solution for the stabilization time of
P = 15. Following stabilization time, the static flow of 10 is repeated each time period and new solutions
are not necessary since the arc flows do not change in value. The small lower numbers in the nodes
are the node names and the larger upper numbers are the node numbers π(x). The data in the arc
boxes are the following: upper left, capacity; upper right, transit time; and the lower number is the
arc flow f(x, y). Capacities and transit times are symmetric, e.g., c(x, y) = c(y, x).
Figure 1.
In Figure 2 is shown the bounded arc flow solution based on the flow solution in Figure 1. The
network data is the same as Figure 1 except the lower numbers in the arc boxes are the upper/lower bounds
G(x, y)/g(x, y).
In decomposing all alternative routes for their maximum optimal flow, we get the following set of
nine routes.
Figure 2.
Possible chain   Time length   P = 15 use   Max flow
0256                  14            2           6
0146                  13            3           4
01456                 15            1           4
02146                 12            4           4
01346                 13            3           4
021456                14            2           5
013456                15            1           4
021346                12            4           4
0213456               14            2           5
These alternative routes offer a fair variety of ways of scheduling a particular optimal solution.
For example, we list below two different solutions for contrast. Here, again, the time span is 15.
Solution A
Chain    Flow   Use   Dynamic flow
0146      4      3        12
0256      6      2        12
Total                     24

Solution B
Chain    Flow   Use   Dynamic flow
01456     4      1         4
02146     4      4        16
0256      2      2         4
Total                     24
Consider, for instance, that Nodes 1 and 2 are origins and Nodes 4 and 5 are destinations. Then if
there were some preference, not formally stated, for maximizing the origin-destination deliveries 1-4
and 2-5, one would choose Solution A. However, if the preferred pairings were 1-5 and 2-4, Solution B
is the best.
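The "use" column in these tables follows directly from temporal repetition: a chain of time length τ can be started in periods 0 through P − τ, i.e., used P + 1 − τ times, and each use delivers the chain flow. A few lines of arithmetic (our own check, with chain names written as digit strings as in the tables) confirm that both solutions deliver the same total of 24:

```python
P = 15

# (chain, time length, chain flow) as tabulated above.
solution_a = [("0146", 13, 4), ("0256", 14, 6)]
solution_b = [("01456", 15, 4), ("02146", 12, 4), ("0256", 14, 2)]

def dynamic_flow(chains, P):
    # Each chain of time length tau is used P + 1 - tau times.
    return sum(flow * (P + 1 - tau) for _, tau, flow in chains)

print(dynamic_flow(solution_a, P))  # 4*3 + 6*2 = 24
print(dynamic_flow(solution_b, P))  # 4*1 + 4*4 + 2*2 = 24
```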
VI. EXPERIENCE
Some version of the Bounded Flow Algorithm has been in use at The George Washington University
since late 1967. The algorithm and the associated computer codes have been revised several times with
the objective of increasing their efficiency. Currently, the bounded arc flow computation takes
approximately one minute on a 500-arc network. The program is written in PL/1 for an IBM 360/50.
VII. ACKNOWLEDGEMENTS
The research was conducted as part of the Program in Logistics of the Institute for Management
Science and Engineering, The George Washington University. The work was supported by the Office of
Naval Research.
Special recognition is due to Donald J. Hunt of the Program in Logistics who, since the very
beginning of this development, has made material contributions to the power and efficiency of the
computational procedures. He is solely responsible for the large gains achieved in decreased running time which
followed the original implementation of the algorithm.
Thanks are also due to Raymond W. Lewis of the Program in Logistics for his valuable observations
and suggestions during the various levels of algorithmic development.
REFERENCES
[1] Ford, L. R., Jr. and D. R. Fulkerson, "Constructing Maximal Dynamic Flows from Static Flows,"
Operations Research 6, 419-433 (1958).
[2] Ford, L. R., Jr. and D. R. Fulkerson, Flows in Networks (Princeton University Press, 1962).
PRODUCTION-ALLOCATION SCHEDULING AND CAPACITY
EXPANSION USING NETWORK FLOWS UNDER UNCERTAINTY
Juan Prawda
Tulane University
New Orleans, Louisiana
ABSTRACT
This paper extends Connors and Zangwill's work in network flows under uncertainty to
the convex costs case. In this paper the extended network flow under uncertainty algorithm
is applied to compute N-period production and delivery schedules of a single commodity
in a two-echelon production-inventory system with convex costs and low demand items. Given
an initial production capacity for N periods, the optimal production and delivery schedules
for the entire N periods are characterized by the flows through paths of minimal expected
discounted cost in the network.
As a by-product of this algorithm, the multiperiod stochastic version of the parametric
budget problem for the two-echelon production-inventory system is solved.
1. INTRODUCTION
In a recent paper Connors and Zangwill [7] developed the Network Flow Under Uncertainty
(NFUU) or r-networks by allowing the requirements or availabilities at the nodes of a network to be
discrete random variables with known probability distributions. They extended the standard
deterministic multistage network flow problem introduced by Ford and Fulkerson [11]. The underlying
structure of network flow problems was exploited in Ref. [7] to produce both a new structure which is
not a deterministic network, but maintains many of its properties, and a new node which replicates
flows instead of conserving them. They called the former r-networks and the latter r-nodes. Construction
of the NFUU from a given N-period stochastic problem is not given in this paper; the reader is referred
to [7]. Given the N-period problem and convex objective criteria, we will develop an algorithm to solve
the network flow problem that minimizes expected cost.
Two applications of this algorithm are given:
1) To compute optimal N-period production and delivery schedules of a single commodity in a
two-echelon production-inventory system with convex costs and low demand items; and,
2) To solve the parametric-budgetary problem in the multiperiod stochastic case, corresponding to
the system described in (1).
This paper is organized as follows: In section 2 the Convex Network Flow Under Uncertainty
Algorithm is stated and its validity and convergence proven; in section 3 the N-period, two-echelon
production and delivery inventory problem is stated. In section 4 we extend the parametric budgetary
problem to the multiperiod, stochastic case, for the system considered in section 3.
2. THE CONVEX NETWORK FLOW UNDER UNCERTAINTY ALGORITHM
Let G = (N, A) denote a Network Flow Under Uncertainty (NFUU), where N is a finite collection
of elements x, y, . . . and A is a finite subset of ordered pairs (x, y) of elements taken from N. N is
supposed to be of the form N = N₁ ∪ N₂ ∪ N₃ with Nᵢ ∩ Nⱼ = ∅ for i, j = 1, 2, 3, i ≠ j. The elements of
N₁ are called nodes, the elements r₁, r₂, . . . of N₂ are called replication nodes or r-nodes, and the
elements c₁, c₂, . . . of N₃ are called collating nodes or c-nodes. Members of A are referred to as arcs.
All arcs will be supposed to be of the form (x, y) with x ≠ y, x, y in N. We exclude arcs (x, y) where
both x, y are in Nᵢ, i = 2, 3, and arcs going from r-nodes to c-nodes and vice versa.
If x is in N, we let a(x) ("after x") denote the set of all y in N for which (x, y) is in A, that is,
a(x) = {y ∈ N | (x, y) ∈ A}.
Similarly, we let b(x) ("before x") denote the set of all y in N for which (y, x) is in A, that is,
b(x) = {y ∈ N | (y, x) ∈ A}.
Given G, each arc (x, y) in A has associated with it a nonnegative real number q(x, y), called the
capacity of the arc (x, y) in A; a nonnegative integer f(x, y), called the flow of the arc (x, y) in A;
and a nonnegative real number g(x, y), called the expected discounted cost of (x, y) in A. Both f and
g are functions from A to the nonnegative reals, the former having nonnegative integers as its range.
Let s, called the source, and t, called the sink, be two distinguished elements of N.
Each rₖ in N₂ has a single input arc and several output arcs and possesses the following two
properties:
(1) f(x, rₖ) = f(rₖ, y) for all y in a(rₖ) and some x in b(rₖ)
and
(2) g(x, rₖ) ≥ 0, g(rₖ, y) = 0 for all y in a(rₖ) and some x in b(rₖ).
Property (1) merely states that the flow on each of the output arcs of an r-node must be identical with that
on the input arc, and (2) states that all the outgoing arcs of an r-node have an expected discounted cost
of zero. Each cₖ in N₃ is essentially the negative of an r-node, that is, it has several input arcs and a single
output arc and possesses the following two properties:
(3) f(y, cₖ) = f(cₖ, x) for all y in b(cₖ) and some x in a(cₖ)
and
(4) g(cₖ, x) ≥ 0, g(y, cₖ) = 0 for all y in b(cₖ) and some x in a(cₖ).
Properties (3) and (4) are, respectively, the negative of (1) and (2).
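Properties (1) through (4) are mechanical enough to check in code. The sketch below (our own names and data, not from the paper) validates an r-node; the c-node check is the mirror image, with the roles of a( ) and b( ) exchanged.

```python
def is_valid_r_node(r, f, g, before, after):
    """Check properties (1) and (2) for a replication node r.

    f, g   : dicts (x, y) -> arc flow / expected discounted cost.
    before : b(r), input endpoints (an r-node has exactly one input arc).
    after  : a(r), output endpoints.
    """
    (x,) = before                     # the single input arc is (x, r)
    replicates = all(f[(r, y)] == f[(x, r)] for y in after)            # (1)
    zero_cost = g[(x, r)] >= 0 and all(g[(r, y)] == 0 for y in after)  # (2)
    return replicates and zero_cost

# A node 'r' that copies 2 units of flow onto two output arcs at zero cost.
f = {('a', 'r'): 2, ('r', 'u'): 2, ('r', 'v'): 2}
g = {('a', 'r'): 1.5, ('r', 'u'): 0, ('r', 'v'): 0}
print(is_valid_r_node('r', f, g, ['a'], ['u', 'v']))  # True
```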
It is shown in [7] that
(5) G = ∪ᵢ₌₁ᴹ Gⁱ ∪ N₂ ∪ N₃,
where Gⁱ = (N₁ⁱ, Aⁱ), i = 1, . . ., M, M is a finite integer and Gⁱ ∩ Gʲ = ∅ for i, j = 1, . . ., M, i ≠ j.
Each Gⁱ, i = 1, . . ., M, is called a subnetwork and it is an ordinary network consisting of ordinary
nodes xⁱ, yⁱ, . . . in N₁ⁱ and ordinary arcs (xⁱ, yⁱ) in Aⁱ where xⁱ, yⁱ are in N₁ⁱ. In each subnetwork
Gⁱ the total inflow equals the total outflow, and the flow is conserved at each node in N₁ⁱ (i = 1, . . ., M).
Two subnetworks, say Gⁱ and Gʲ, i, j = 1, . . ., M, i ≠ j, are connected if there exists at least one
rₖ in N₂ or cₖ in N₃ for which (xⁱ, rₖ) and (rₖ, yʲ) are in A when xⁱ ∈ N₁ⁱ and yʲ ∈ N₁ʲ, or (xⁱ, cₖ) and (cₖ, yʲ)
are in A when xⁱ ∈ N₁ⁱ and yʲ ∈ N₁ʲ. Let
N₂ⁱ = {rₖ ∈ N₂ | (rₖ, xⁱ) or (xⁱ, rₖ) ∈ A, xⁱ ∈ N₁ⁱ},
N₃ⁱ = {cₖ ∈ N₃ | (cₖ, xⁱ) or (xⁱ, cₖ) ∈ A, xⁱ ∈ N₁ⁱ}
be the sets of r- and c-nodes that connect subnetwork i (i = 1, . . ., M) with the rest of the NFUU.
From the NFUU G = (N; A) (Figure 1) we observe that N₁ = {1, 2, . . ., 17, 18}, N₂ = {r₁, r₂, r₃},
N₃ = {c₁, c₂, c₃}, and M = 8.
Figure 1. A network flow under uncertainty or r-network.
The algorithm to follow is based on the works of Connors and Zangwill [7] and Hu [14]. It is very
closely related to the works of Beale [2], Busacker and Gowen [4], Hu [15], and Zangwill [24]. This
algorithm, utilizing the decomposition (5), iterates by determining shortest routes, or routes of minimal
expected discounted cost, along the subnetworks (which are ordinary networks), forcing one unit of
flow on this route and making appropriate adjustments for the r- and c-nodes.
Let h[f(x, y)] be a nonnegative convex function of f(x, y) for all (x, y) in A, such that h(0) = 0,
and the flow function f(x, y) is required to have nonnegative integers as its range.
The cost function for the entire network is
Σ over all (x, y) ∈ A of h[f(x, y)].
This cost function is a sum of convex functions and thus convex. Let
(6) h̄[f(x, y)] = h[f(x, y) + 1] − h[f(x, y)]
for f(x, y) ≥ 0 and all (x, y) in A, and
(7) h̲[f(x, y)] = h[f(x, y) − 1] − h[f(x, y)]
for f(x, y) > 0 and all (x, y) in A.
Expression (6) defines the up-cost of an arc and (7) the down-cost of an arc. It is shown in [14],
that h̄( ) > 0 and h̲( ) < 0; h̄(a) < h̄(b) for a < b and |h̲(a)| < |h̲(b)|. A particular flow called a
path-flow is a flow with f(s, x) = f(x, y) = . . . = f(z, t) = 1 and f(u, w) = 0 for all u, w ≠ s, x, y,
. . ., z, t. If the cost of a flow with value v is known and we superimpose a path flow on this given
flow, the resulting flow has value v + 1. h̄ is used if the arc flow of the path flow is of the same direction
as that of the arc flow of the flow with value v and h̲ is used if the two flows are of opposite directions.
The sum of the h̄ and h̲ used in the path flow is called the incremental cost of the path flow.
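For a concrete convex cost, say h(f) = f² (our example, not one from the paper), the up-costs of (6) form an increasing sequence and the down-costs of (7) are negative with increasing magnitude, exactly the properties quoted from [14]:

```python
def up_cost(h, f):      # expression (6): cost of adding one unit to the arc
    return h(f + 1) - h(f)

def down_cost(h, f):    # expression (7): cost change from removing one unit
    return h(f - 1) - h(f)

h = lambda f: f * f     # a convex arc cost with h(0) = 0

print([up_cost(h, f) for f in range(4)])       # [1, 3, 5, 7]
print([down_cost(h, f) for f in range(1, 4)])  # [-1, -3, -5]
```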
An iteration of the convex NFUU algorithm first requires construction of a modified r-network
from the current flow in the original r-network. A shortest route algorithm is then applied to determine
the shortest route from source to sink in the modified r-network, using the up-cost and down-cost
(h̄( ) and h̲( ), respectively) of an arc as its length. One unit of flow is then forced from source to
sink in the original network along a route corresponding to the shortest path obtained in the modified
r-network. The up-cost and down-cost of all the arcs in the modified r-network are redefined based on
the new flow pattern of the original network. This cycle is repeated until the amount of flow at t in the
original network is I.*
Given G (the original network) and I (a positive integer corresponding to an input flow to the
network), the precise algorithmic statements for the convex NFUU algorithm are:
STEP 0 (Initialization): Set f(x, y) = 0 for all (x, y) in A.
STEP 1 (Network Modification): Given the current flow in the original network f(x, y) for all
(x, y) in A, define a modified r-network as follows:
a) If 0 ≤ f(x, y) < q(x, y), leave the arc (x, y) in the modified r-network with cost h̄[f(x, y)] as
defined in (6);
b) If f(x, y) = q(x, y), delete the arc (x, y) from the modified r-network;
c) If 0 < f(x, y) and
i) x, y are in N₁, add a reverse arc (y, x) in the modified r-network with cost h̲[f(x, y)] as
defined in (7);
ii) x is in N₁ and y is in N₂, add reverse arcs (y, x) and (z, y) in the modified r-network for all z in
*The definition of I is given in the next sentence.
a(y), the former with cost h̲[f(x, y)] as defined in (7) and the latter with cost zero;
iii) x is in N₃ and y is in N₁, add reverse arcs (y, x) and (x, z) in the modified r-network for all z in
b(x), the former with cost h̲[f(x, y)] as defined in (7) and the latter with cost zero.
Both b) and c) must be done if f(x, y) = q(x, y) for all (x, y) in A.
STEP 2 (Shortest Route): Determine the shortest route, or route of minimal expected discounted
cost, from s to t in the modified r-network. Use properties (1) and (2) of r-nodes and (3) and (4) of c-nodes.
Apply any shortest route algorithm [9, 11] with h̄( ) and h̲( ) as lengths in the arcs of the modified
r-network.
STEP 3 (Flow Augmentation): Send one unit of flow from s to t in the original network along the route
corresponding to the shortest path just obtained in Step 2, that is, along the path whose incremental
cost in the modified r-network relative to the existing flow in the original network is minimum.
STEP 4 (Iteration and Stopping Rule): If the amount of flow at t in the original network is I, stop.
Otherwise return to Step 1 with the current flow.
If, during the application of Step 2, no shortest path exists from s to t in the modified r-network, the
original problem is infeasible.
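Steps 0 through 4 can be sketched on an ordinary network (the r-/c-node replication bookkeeping of Step 1(c)(ii)-(iii) is omitted, and all names are ours, not the paper's). Each cycle builds the modified network with up-costs on forward arcs and down-costs on reverse arcs, finds a shortest route by Bellman-Ford (the down-costs are negative, so a label-correcting method is used), and forces one unit along it:

```python
def convex_flow(arcs, h, s, t, I):
    """Successive shortest-route sketch of the convex NFUU algorithm on
    an ordinary network.  arcs: (x, y) -> capacity q(x, y); h: convex
    arc cost with h(0) = 0.  Pushes I units of flow, one per cycle.
    """
    f = {a: 0 for a in arcs}                       # STEP 0
    nodes = {v for a in arcs for v in a}
    for _ in range(I):
        # STEP 1: modified network; up-cost forward, down-cost on reverse arcs.
        edges = []
        for (x, y), q in arcs.items():
            if f[(x, y)] < q:
                edges.append((x, y, h(f[(x, y)] + 1) - h(f[(x, y)]), (x, y), +1))
            if f[(x, y)] > 0:
                edges.append((y, x, h(f[(x, y)] - 1) - h(f[(x, y)]), (x, y), -1))
        # STEP 2: Bellman-Ford shortest route (handles negative down-costs).
        dist = {v: float('inf') for v in nodes}
        dist[s], pred = 0, {}
        for _ in range(len(nodes) - 1):
            for u, v, w, arc, sgn in edges:
                if dist[u] + w < dist[v]:
                    dist[v], pred[v] = dist[u] + w, (u, arc, sgn)
        if dist[t] == float('inf'):
            raise ValueError("infeasible: no route from s to t")
        # STEP 3: force one unit of flow along the shortest route.
        v = t
        while v != s:
            u, arc, sgn = pred[v]
            f[arc] += sgn
            v = u
    return f                                       # STEP 4 after I cycles
```

On two parallel two-arc paths with h(f) = f², pushing two units splits the flow one unit per path: after the first unit, the second unit on a used arc has up-cost 3 while a fresh arc costs 1, so the cheaper incremental route is the unused path.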
The validity and convergence of the algorithm is proven in [7] for the linear case. The next theorem
proves the validity and convergence of the algorithm for the convex case.
Let j = 1 stand for the source s and j = m for the sink t. We will next prove that the convex NFUU algorithm is equivalent to computing a flow vector f = (f_i), i = 1, . . ., k (k is the total number of elements in A), and a vector b = (b_j), j = 2, . . ., m − 1, whose components are the amounts of flow at the nodes (b_1 = I, b_m = −I), which

(8)  Min h(f)

subject to

Df = b

If ≤ q

f ≥ 0,

where Df = b merely states that:
a) the total inflow and total outflow of every node in N_1 must equal the amount of flow at the node,
b) every incoming flow (outgoing flow) of an r (c) node equals the amount of flow at the node,
c) every replication* (collating*) flow of an r (c) node equals the amount of flow at the node,

q = (q_i), i = 1, . . ., k, is the vector of arc capacities, I is a k × k identity matrix, and h(f) = Σ_{i=1}^{k} h_i(f_i), where each h_i(f_i) is a real-valued convex cost function of f_i, i = 1, . . ., k.

*The flow on each of the outgoing (incoming) arcs of an r (c) node is called a replication (collating) flow.

THEOREM 1: Assume that at the end of iteration s (s ≥ 0) of the convex NFUU algorithm, f^s is a feasible solution to the convex programming problem

(9)  Min h(f)

subject to

Df = b^s

f ≥ 0.

522 J. PRAWDA

We let f^0 = 0 and f^s < I, where I is the input flow to the NFUU. For this f^s suppose f̄^s is optimal to

(10)  Min h̄(f̄)

subject to

Df̄ = b̄

−f^s ≤ If̄ ≤ q − f^s,

where the optimal solution f̄^s is a path flow,

f̄^s_i = 1 if arc i is in the path flow and 0 if it is not, for i = 1, . . ., k, and

b̄_j = 1 if node j is in the path flow and 0 if it is not, for j = 1, . . ., m.

Then f^{s+1} = f^s + f̄^s is optimal for

(11)  Min h(f)

subject to

Df = b^s + b̄

If ≤ q

f ≥ 0.
PROOF: For h(f) linear, the proof is in [7]. Assume h(f) convex with h(0) = 0. First we will prove f^{s+1} is feasible for (11). Adding the first m constraints of (9) and (10) yields

D(f^s + f̄^s) = b^s + b̄,

or

Df^{s+1} = b^s + b̄.

f^{s+1} is always bounded below by zero, since f^s ≥ 0 from (9) and f̄^s ≥ −f^s from (10). The last 2k constraints of (10) yield

−If^s ≤ If̄^s ≤ q − If^s;

adding If^s throughout,

0 ≤ If^s + If̄^s = If^{s+1} ≤ q,

and thus f^{s+1} is feasible to (11).

Next, we prove the optimality of f^{s+1}. Let the cost associated with problem (9) be h(f^s) and let h̄(f̄^s) be the optimal incremental cost of problem (10) corresponding to the optimal path flow f̄^s. It follows from the optimality of f̄^s that h̄(f̄^s) ≤ h̄(f̄) for any path flow f̄ feasible to problem (10). Since the cost of the new flow f^{s+1} in the original network equals the cost of the old existing flow f^s in the original network plus the incremental cost of the path flow f̄^s in the modified r-network, it follows that

h(f^{s+1}) = h(f^s + f̄^s) = h(f^s) + h̄(f̄^s) ≤ h(f^s) + h̄(f̄) = h(f^s + f̄) = h(f)

for any f = f^s + f̄ feasible to problem (11). Then f^{s+1} is optimal to (11). Q.E.D.
The last theorem proves that, in terms of the NFUU, at the end of each iteration the flow will be optimal for the amount thus far placed into the source of the NFUU. The convergence of the algorithm follows from the fact that f^{s+1} is bounded above by a finite integer I and at each iteration the flow increases by one unit.
Next we suggest the application of the above convex NFUU algorithm to the solution of two problems, one given in section 3 and the other in section 4.
3. AN N-PERIOD, 2-ECHELON PRODUCTION-INVENTORY SYSTEM
Interest in multi-echelon inventory systems has been spurred by the existence of large military logistics networks and private industry. A number of papers on single- and multi-product, multi-installation inventory models have been published. A comprehensive review of these topics can be found in the excellent published bibliographies of Iglehart [17], [18], Scarf, Gilford, and Shelley [20], and Veinott [22].
Several approaches have been used to compute optimal N-period reordering points in the preceding multi-echelon inventory systems. For instance, Bessler and Veinott [3], using the assumption that stock left over (backlogged) at the end of the N periods in each facility can be salvaged (purchased) at the same stationary unit price, decompose an N-variable linear cost function into the sum of N one-variable linear functions, and a stationary policy given by a critical vector is shown to be optimal.
Relaxing Bessler and Veinott's [3] assumption, the N-period problem will, in general, not decompose into N one-period problems, and dynamic programming is used to compute the optimal policy. Others, such as Clark and Scarf [5, 6], have used dynamic programming. However, its use has been shown to be computationally infeasible for even simpler problems than the one considered by Bessler and Veinott. (See [17, 18].)
The objective of this section is to suggest the use of the previous convex NFUU algorithm to solve N-period, multi-echelon production and delivery inventory systems. It is the structure of the NFUU networks that allows for some computational improvement in obtaining optimal production and delivery schedules of N-period, multi-echelon stochastic inventory problems with low item demands, with respect to other techniques used to solve similar systems, such as dynamic programming.
This paper is concerned with the problem of scheduling the production x_{01}, x_{02}, . . ., x_{0N} and allocation x_{11}, . . ., x_{1N}, x_{21}, . . ., x_{2N}, . . ., x_{n1}, . . ., x_{nN} of a single product in facilities 1, . . ., n in successive time periods 1, 2, . . ., N so as to minimize the total expected discounted costs over the N periods. The requirements of each facility are discrete random variables, each of which has a known probability mass function.
Figure 2 illustrates a two-echelon system consisting of a plant, a warehouse 0, and n facilities numbered 1, 2, . . ., n. Although we are interested in more general multi-echelon systems, the preceding one will suffice to illustrate our approach. Some remarks concerning the generalization to more complex multi-echelon systems are given at the end of this section.
FIGURE 2. A plant with known production capacity for the N periods (t = 1, . . ., N) supplies facilities 1, . . ., n, each facing uncertain requirements.
At the beginning of period one we consider a known production capacity for the N periods. Let I denote the production capacity. The production capacity could be present in this problem due to restrictions on raw material to produce the given single commodity. Uncertain requirements in facility i (i = 1, . . ., n) in each period are satisfied insofar as possible from stock on hand at the beginning of the period in that facility and from the allocation and production of the commodity at the beginning of the period. Requirements which cannot be met in a given period (because, for example, of limited production) are backlogged until they can be satisfied by subsequent production or allocation in future periods.
Let a_{it} (i = 1, . . ., n; t = 1, . . ., N) be a parameter associated with each facility i in any period t. This parameter will take the following values:

a_{it} = 1, 2, . . ., n_{it}   for i = 1, . . ., n and t = 1, . . ., N.

Let {D_{it}, i = 1, . . ., n; t = 1, . . ., N} be a family of discrete, nonnegative random variables. For a fixed i and t, D_{it} takes on values in {d_1, d_2, . . ., d_{n_{it}}}, a set of nonnegative real numbers. Let

P_{a_{it}} = P{D_{it} = d_{a_{it}}}   for all a_{it}.

The sequence {P_{a_{it}}} of real numbers will be a probability distribution of D_{it}, with P_{a_{it}} ≥ 0 and

Σ_{a_{it}=1}^{n_{it}} P_{a_{it}} = 1.

This distribution is assumed to be known. D_{it} is not necessarily independent* between any two facilities or identically distributed for successive periods. Let ω_{it} ≜ (a_{i1}, a_{i2}, . . ., a_{it}), i = 1, . . ., n and 1 ≤ t ≤ N, be the index associated with the random variables defined below. This notation identifies the sequence of realizations in facility i up to period t. Thus, for example, ω_{i3} ≜ (2, 1, 5) denotes the following sequence of events in facility i: realization 2 in period 1, realization 1 in period 2, and realization 5 in period 3.
Let

ω_{0t} ≜ (ω_{11}, ω_{21}, . . ., ω_{n1}, . . ., ω_{1t}, ω_{2t}, . . ., ω_{nt}).

We let D_{it}^{(ω_{it})}, for i = 1, . . ., n and t = 1, . . ., N, be the vector of realizations caused by the stochastic requirements in facility i up to period t. It is conditioned on all previous realizations in that facility in previous time periods, that is, D_{i1}^{(ω_{i1})}, . . ., D_{i,t−1}^{(ω_{i,t−1})}, and occurs with conditional probability P(ω_{it}), where

P(ω_{it}) = P{D_{it} = d_{a_{it}} | D_{i,t−1} = d_{a_{i,t−1}}, . . ., D_{i1} = d_{a_{i1}}}.
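The probability of a whole realization sequence ω_{it} is then the product of the conditional probabilities P(ω_{ik}), k = 1, . . ., t. A minimal sketch (the function name and the toy i.i.d. conditional used in the example are illustrative assumptions, not the paper's notation):

```python
def sequence_probability(realizations, conditional):
    """Probability of omega_{it} = (a_{i1}, ..., a_{it}) as the product of
    the conditional probabilities P(omega_{ik}), k = 1..t.

    conditional(k, a_k, history) -> P{D_{ik} = d_{a_k} | past realizations}."""
    p, history = 1.0, []
    for k, a in enumerate(realizations, start=1):
        p *= conditional(k, a, tuple(history))
        history.append(a)
    return p
```

For independent demands the conditional reduces to the marginal, e.g. a uniform two-point demand gives `sequence_probability([1, 2, 1], lambda k, a, h: 0.5)` = 0.125.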
Let x_{0t}^{(ω_{0t})} (≥ 0) be the production completed in the plant (warehouse) at the beginning of period t given the sequence of random events ω_{0t}, and x_{it}^{(ω_{it})} (≥ 0) (i = 1, . . ., n) the allocation completed in facility i at the beginning of period t (t = 1, . . ., N) given the sequence of random events ω_{it}. Let I_{0t}^{(ω_{0t})} (t = 1, . . ., N) be the inventory at the end of period t in the warehouse given the sequence ω_{0t}, and I_{it}^{(ω_{it})} (i = 1, . . ., n; t = 1, . . ., N) be the inventory at the end of period t in each facility i given the sequence ω_{it}. We will assume that the initial inventory at all facilities and the warehouse is zero at the beginning of period one.
In order to simplify the statement of the problem, we are provisionally going to suppress the index ω_{it} and merely refer to the random variables x_{it} and I_{it}, i = 0, 1, . . ., n, t = 1, . . ., N.
*If, for a fixed i, D_{it} (t = 1, . . ., N) is a sequence of dependent random variables, then the marginal distribution P_{a_{ik}} (1 ≤ k ≤ N) can be obtained from the given joint probability distribution.
We will assume that the lead time in production and delivery to the n facilities is zero. The inventory level equation becomes

I_{it} = Σ_{h=1}^{t} (x_{0h} − Σ_{k=1}^{n} x_{kh})   for i = 0,

I_{it} = Σ_{h=1}^{t} (x_{ih} − D_{ih})   for i = 1, . . ., n,

where t = 1, . . ., N. Let β_i be a nonnegative integer denoting the number of periods of backlog permitted for facility i; thus

I_{it} ≥ − Σ_{k=t−β_i+1}^{t} D_{ik}   for all i = 1, . . ., n.
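The inventory recursion and the backlog bound can be checked on a single sampled demand path; this sketch (all names are illustrative assumptions) treats one realization of the D_{it}, not the full scenario tree the NFUU works with:

```python
def inventory_levels(x, D, beta):
    """End-of-period inventories I_{it} along one sample path.

    x[i][t-1]: production (i = 0) or allocation (i >= 1) at the start of period t;
    D[i][t-1]: realized demand in facility i (row D[0] is unused);
    beta[i-1]: periods of backlog permitted in facility i.
    Returns I with I[i][t-1] = inventory at the end of period t; raises if a
    facility's backlog exceeds its permitted bound."""
    n = len(x) - 1
    N = len(x[0])
    I = [[0] * N for _ in range(n + 1)]
    for t in range(1, N + 1):
        # warehouse: previous stock plus production minus all allocations
        prev = I[0][t - 2] if t > 1 else 0
        I[0][t - 1] = prev + x[0][t - 1] - sum(x[i][t - 1] for i in range(1, n + 1))
        for i in range(1, n + 1):
            prev = I[i][t - 2] if t > 1 else 0
            I[i][t - 1] = prev + x[i][t - 1] - D[i][t - 1]
            # backlog bound: I_{it} >= -(sum of the last beta_i demands)
            bound = -sum(D[i][k - 1] for k in range(max(1, t - beta[i - 1] + 1), t + 1))
            if I[i][t - 1] < bound:
                raise ValueError(f"backlog limit exceeded at facility {i}, period {t}")
    return I
```

For instance, with one facility, two periods, schedule x = [[3, 0], [2, 1]] and demands D = [[0, 0], [1, 2]], the recursion yields warehouse inventories (1, 0) and facility inventories (1, 0).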
Note that, in general, we would have

Σ_{t=1}^{N} x_{0t} ≤ I,

which implies that it may be optimal not to use all the production capacity I through the N periods.* Let K_{it} be the known capacity of the production line for i = 0 and of the transportation facilities for i > 0 (i = 1, . . ., n) during period t (t = 1, . . ., N). Let Q_{it} be the known storage capacity for the warehouse (i = 0) and the n facilities (i = 1, . . ., n) during period t (t = 1, . . ., N). Thus we will require that

x_{it} ≤ K_{it}   for i = 0, . . ., n and t = 1, . . ., N

and

I_{it} ≤ Q_{it}   for i = 0, . . ., n and t = 1, . . ., N.
Let

z = (x_{01}^{(ω_{01})}, . . ., x_{0N}^{(ω_{0N})}, x_{11}^{(ω_{11})}, . . ., x_{1N}^{(ω_{1N})}, . . ., x_{n1}^{(ω_{n1})}, . . ., x_{nN}^{(ω_{nN})})

be the schedule vector for the entire system, given the sequence of realizations in all facilities i, i = 1, . . ., n, up to period N.
We have the following costs in each period: production, shipping, holding, and shortage (the last one due to backlogged demand). The preceding costs are assumed to be convex functions of the quantities produced and delivered at the beginning of the period and of the quantities stored or backlogged
*In terms of the NFUU this can be accomplished by arcs representing unused production capacity in the plant at each period t, t = 1, 2, . . ., N.
at the end of the period, respectively. The cost functions for successive periods need not be the same. Let δ_t, 0 ≤ δ_t ≤ 1, be the discount factor for period t. Let γ_1 = 1 and γ_t = Π_{j=1}^{t−1} δ_j for t > 1. The total expected discounted cost F(z) is defined to be the sum of the following expected discounted costs:

i) Total expected discounted production and holding cost in the warehouse:

Σ_{t=1}^{N} γ_t [C_t(x_{0t}) + H_{0t}(I_{0t})].

ii) Total expected discounted transportation, holding, and penalty cost in all facilities:

Σ_{i=1}^{n} Σ_{t=1}^{N} γ_t [T_{it}(x_{it}) + H_{it}(I_{it})],

where

H_{it} = Max {h_{it}(I_{it}), p_{it}(I_{it})},

where h_{it}(·), p_{it}(·) are convex functions of their arguments satisfying

h_{it}(I_{it}) − p_{it}(I_{it})  > 0 if I_{it} > 0,  = 0 if I_{it} = 0,  < 0 if I_{it} < 0;

h_{it}(·) and p_{it}(·) are, respectively, the expected holding and penalty costs,* for i = 0, . . ., n and t = 1, . . ., N. C_t(·), h_{it}(·), p_{it}(·), and T_{it}(·) are convex functions of their respective arguments with C_t(0) = h_{it}(0) = p_{it}(0) = T_{it}(0) = 0 for i = 0, . . ., n and t = 1, . . ., N.
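The discount weights γ_t and a discounted cost sum can be computed directly from the definition γ_1 = 1, γ_t = δ_1 · · · δ_{t−1}; a small sketch (function names are assumptions):

```python
def discount_factors(delta):
    """gamma_1 = 1 and gamma_t = delta_1 * ... * delta_{t-1} for t > 1,
    where delta[t-1] is the discount factor for period t."""
    gammas = [1.0]
    for t in range(1, len(delta)):
        gammas.append(gammas[-1] * delta[t - 1])
    return gammas

def total_discounted_cost(delta, period_costs):
    """F = sum over t of gamma_t times the cost incurred in period t."""
    return sum(g * c for g, c in zip(discount_factors(delta), period_costs))
```

With δ_t = 0.9 for three periods and a cost of 100 per period, the weights are (1, 0.9, 0.81) and the discounted total is 271.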
Then the problem can be stated as: given the uncertain market requirements for each of the n facilities over the next N periods and the production capacity for the N periods, find a production-allocation schedule z, called optimal, which minimizes the total expected discounted cost

F(z) = Σ_{t=1}^{N} γ_t [C_t(x_{0t}) + H_{0t}(I_{0t}) + Σ_{i=1}^{n} {T_{it}(x_{it}) + H_{it}(I_{it})}]

subject to

Σ_{t=1}^{N} x_{0t} ≤ I,

*H_{it}(I_{it}) may not be convex for I_{it} = 0. However, in terms of the network flow approach there are going to be arcs with expected cost h_{it}(·) corresponding to storage of inventory in facility i at the end of period t and different arcs with expected cost p_{it}(·) corresponding to backlogged inventory in facility i at the end of period t. Thus H_{it}(I_{it}) is never used explicitly in the NFUU.
0 ≤ x_{it} ≤ K_{it}   for i = 0, . . ., n and t = 1, . . ., N,

and

I_{it} ≤ Q_{it}   for i = 0, . . ., n and t = 1, . . ., N,

where

I_{it} = Σ_{h=1}^{t} (x_{0h} − Σ_{k=1}^{n} x_{kh})   for i = 0,

I_{it} = Σ_{h=1}^{t} (x_{ih} − D_{ih})   for i = 1, . . ., n,

and

I_{it} ≥ − Σ_{k=t−β_i+1}^{t} D_{ik}.
In order to solve the above problem we suggest that the above two-echelon, multi-period, stochastic production-delivery problem be rewritten in terms of the NFUU (see [7]), and then that the convex NFUU algorithm (described in section 2) be used to compute the optimal production and delivery schedules. Since the NFUU decomposes into a set of subproblems which are small network flow problems, this algorithm seems more attractive than dynamic programming, especially for multi-echelon inventory systems with low-demand items (0-1 demands) and a small number of time periods. It was shown in Ref. [7] that the amount of computer storage required by the NFUU algorithm is proportional only to n, the dimension of the one-stage problem.
The approach suggested in this paper can be applied to more general multi-echelon systems than the one depicted in Figure 2 (where transshipments between facilities are allowed), with a consequent increase in the number of arcs and nodes in the NFUU.
4. A MULTI-PERIOD, STOCHASTIC VERSION OF THE PARAMETRIC BUDGET PROBLEM
Suppose that a fixed budget of v dollars can be allocated among the production-line, transportation, and storage facilities of the existing production-delivery inventory system for the purpose of increasing the production capacity I for the N periods. The cost of increasing the capacity of the production line (i = 0) and transportation facilities (i = 1, . . ., n) during period t (t = 1, . . ., N) is v_{it} dollars per unit increase. The cost of increasing storage capacity at the warehouse (i = 0) and the n facilities (i = 1, . . ., n) during period t (t = 1, . . ., N) is v_{s_{it}} dollars per unit increase.
Let w_{it} and w_{s_{it}} be decision variables corresponding, respectively, to the amount of additional capacity to be built in the production-line (i = 0), transportation (i = 1, . . ., n), and storage facilities during period t (t = 1, . . ., N), dependent upon the sequence ω_{it} of random events in facility i up to period t. Then the problem of increasing the production capacity for the N periods to, say, I′ (I′ > I) is to minimize the following expected expansion cost.
Min Σ_{i=0}^{n} Σ_{t=1}^{N} (v_{it} · P(ω_{it}) · w_{it}^{(ω_{it})} + v_{s_{it}} · P(ω_{it}) · w_{s_{it}}^{(ω_{it})})

subject to

Σ_{t=1}^{N} x_{0t}^{(ω_{0t})} ≤ I′,

0 ≤ x_{it}^{(ω_{it})} ≤ K_{it} + w_{it}^{(ω_{it})}   for i = 0, 1, . . ., n and t = 1, . . ., N,

I_{it}^{(ω_{it})} ≤ Q_{it} + w_{s_{it}}^{(ω_{it})}   for i = 0, . . ., n and t = 1, . . ., N,

where

I_{it}^{(ω_{it})} = Σ_{h=1}^{t} (x_{0h}^{(ω_{0h})} − Σ_{k=1}^{n} x_{kh}^{(ω_{kh})})   for i = 0,

I_{it}^{(ω_{it})} = Σ_{h=1}^{t} (x_{ih}^{(ω_{ih})} − D_{ih}^{(ω_{ih})})   for i = 1, . . ., n,

and

I_{it}^{(ω_{it})} ≥ − Σ_{k=t−β_i+1}^{t} D_{ik}^{(ω_{ik})}.
A problem related to the one just given would be to maximize the production capacity I′ (now a decision variable) for the N periods with a fixed budget of v dollars. This problem is stated as

Max I′

subject to

Σ_{i=0}^{n} Σ_{t=1}^{N} (v_{it} · P(ω_{it}) · w_{it}^{(ω_{it})} + v_{s_{it}} · P(ω_{it}) · w_{s_{it}}^{(ω_{it})}) = v,

0 ≤ x_{it}^{(ω_{it})} ≤ K_{it} + w_{it}^{(ω_{it})}   for i = 0, . . ., n and t = 1, . . ., N,

I_{it}^{(ω_{it})} ≤ Q_{it} + w_{s_{it}}^{(ω_{it})}   for i = 0, . . ., n and t = 1, . . ., N,

where

I_{it}^{(ω_{it})} = Σ_{h=1}^{t} (x_{0h}^{(ω_{0h})} − Σ_{k=1}^{n} x_{kh}^{(ω_{kh})})   for i = 0,

I_{it}^{(ω_{it})} = Σ_{h=1}^{t} (x_{ih}^{(ω_{ih})} − D_{ih}^{(ω_{ih})})   for i = 1, . . ., n,

and

I_{it}^{(ω_{it})} ≥ − Σ_{k=t−β_i+1}^{t} D_{ik}^{(ω_{ik})}.
In terms of the NFUU G = [N; A], the preceding two problems are seen to be an extension of the deterministic parametric budget problem solved by Fulkerson [13] and Hu [14, 16] to the multi-period, stochastic case.
The algorithm for solving the preceding two problems now follows:
STEP 0 (Initialization): Set f(x, y) = 0 for all (x, y) in A.
STEP 1 (Network Modification): Given f(x, y) for all (x, y) in A, define a modified NFUU as follows:
a) If f(x, y) < q(x, y), then h[f(x, y)] = 0;
b) If f(x, y) ≥ q(x, y), then h[f(x, y)] = g(x, y);
c) If 0 < f(x, y) ≤ q(x, y), then h̄[f(x, y)] = 0;
d) If f(x, y) > q(x, y), then h̄[f(x, y)] = −g(x, y);
where g(x, y) is the unit capacity-expansion cost for arc (x, y) in A and q(x, y) is the original capacity of arc (x, y) in A. Obviously properties (1) and (2) of r-nodes and (3) and (4) of c-nodes must hold.
STEP 2 (Shortest Route): Send one unit of flow from s to t in the original network along a route corresponding to the shortest path just calculated in the modified network, that is, the path in the modified network whose incremental cost is minimum. Apply any shortest route algorithm [9, 11] with h(·) and h̄(·) as lengths.
STEP 3 (Flow Augmentation and Stopping Rule): If the amount of flow at t is I′ in the original network, or the total amount of money used up is v, stop; otherwise return to Step 1 with the current flow.
It is obvious that*:

w_{it} or w_{s_{it}} = f(x, y) − q(x, y)   if f(x, y) > q(x, y) for some (x, y) in A, and

w_{it} or w_{s_{it}} = 0   if f(x, y) ≤ q(x, y) for some (x, y) in A.
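The fixed-budget (maximum-capacity) variant of these steps can be sketched as a unit-augmentation loop in which an arc below its original capacity is free and each unit pushed beyond it is bought at the expansion price g(x, y); this ignores the storage-expansion variables and the stochastic indices, and every name below is an illustrative assumption:

```python
from math import inf

def expand_capacity(n, arcs, q, g, s, t, budget):
    """Push flow from s to t one unit at a time along the cheapest path;
    an arc with flow below its original capacity q costs nothing, and each
    unit beyond q is bought at the expansion price g. Stops when the budget
    would be exceeded or no s-t path remains.
    Returns (flow, expansion w = max(f - q, 0), money spent)."""
    flow = {a: 0 for a in arcs}
    spent = 0
    while True:
        # Bellman-Ford with the current unit incremental cost of each arc
        dist = [inf] * n
        pred = [None] * n
        dist[s] = 0
        for _ in range(n - 1):
            for (x, y) in arcs:
                c = 0 if flow[(x, y)] < q[(x, y)] else g[(x, y)]
                if dist[x] + c < dist[y]:
                    dist[y] = dist[x] + c
                    pred[y] = (x, y)
        if dist[t] == inf or spent + dist[t] > budget:
            break
        v = t
        while v != s:                 # augment one unit along the path
            x, y = pred[v]
            flow[(x, y)] += 1
            v = x
        spent += dist[t]
    w = {a: max(flow[a] - q[a], 0) for a in arcs}
    return flow, w, spent
```

On a two-arc chain with original capacities 1, expansion price 2 per unit per arc, and a budget of 4, the first unit travels free and the second costs 4, after which the budget is exhausted.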
ACKNOWLEDGMENTS
My sincere thanks to Professors Gordon P. Wright and Larry R. Arnold for their helpful comments.
BIBLIOGRAPHY
[1] Arrow, K., S. Karlin, and H. Scarf, (eds.), Studies in the Mathematical Theory of Inventory and
Production (Stanford University Press, Stanford Calif., 1958).
[2] Beale, E. M. L., "An Algorithm for Solving the Transportation Problem when the Shipping Cost
over each Route is Convex," Nav. Res. Log. Quart. 6, 4356 (1959).
[3] Bessler, S. A. and A. F. Veinott, Jr., "Optimal Policy for a Dynamic MultiEchelon Inventory
Model," Nav. Res. Log. Quart. 13, 335389 (1966).
[4] Busacker, R. G. and P. J. Gowen, "A Procedure for Determining a Family of Minimal Cost Net
work Flow Patterns," Technical Rept. No. 15, Operations Research Office, Johns Hopkins
University, Baltimore, Md. (1961).
*We assume that the function mapping the subscripts (i, t), i = 0, . . ., n, t = 1, . . ., N, onto the arcs (x, y) in A is known from the structure of the NFUU.
[5] Clark, A. and H. Scarf, "Optimal Policies for a Multi-Echelon Inventory Problem," Management Science 6, 475-490 (1960).
[6] Clark, A. and H. Scarf, "Approximate Solutions to a Simple Multi-Echelon Inventory Problem," Chapter 5 in Studies in Applied Probability and Management Science, Arrow, Karlin, and Scarf (eds.) (Stanford University Press, Stanford, Calif., 1962).
[7] Connors, M. and W. Zangwill, "Cost Minimization in Networks with Discrete Stochastic Requirements," Operations Research 19, 794-821 (1971).
[8] Dantzig, G., Linear Programming and Extensions (Princeton University Press, Princeton, N.J., 1963).
[9] Dreyfus, S. E., "An Appraisal of Some Shortest-Path Algorithms," Operations Research 17, 395-412 (1969).
[10] El-Agizy, M., "Dynamic Inventory Models and Stochastic Programming," IBM Journal of Research and Development (1969), pp. 351-356.
[11] Ford, L. R. and D. R. Fulkerson, Flows in Networks (Princeton University Press, Princeton, N.J., 1963).
[12] Ford, L. R. and D. R. Fulkerson, "Constructing Maximal Dynamic Flows from Static Flows," Operations Research 6, 419-433 (1958).
[13] Fulkerson, D. R., "Increasing the Capacity of a Network: The Parametric Budget Problem," Management Science 5, 472-483 (1959).
[14] Hu, T. C., "Minimum Convex Cost Flows," Nav. Res. Log. Quart. 13, 1-19 (1966).
[15] Hu, T. C., "Recent Advances in Network Flows," SIAM Review 10, 354-359 (1968).
[16] Hu, T. C., Integer Programming and Network Flows (Addison-Wesley Publishing Co., Reading, Mass., 1969).
[17] Iglehart, D., "Recent Results in Inventory Theory," J. Indust. Eng. 18, 48-51 (1967).
[18] Iglehart, D., "Recent Developments in Stochastic Inventory Models," Invited Paper at the National Meeting of ORSA, June 19, 1969, Denver, Colorado.
[19] Prawda, J. and G. P. Wright, "On Some Applications of Network Flows Under Uncertainty," Proceedings of the International IEEE Conference on Systems, Networks, and Computers, Oaxtepec, Morelos, Mexico (Jan. 19-21, 1971).
[20] Scarf, H. E., D. Gilford, and M. Shelley, Multistage Inventory Models and Techniques (Stanford University Press, Stanford, Calif., 1963).
[21] Veinott, A., Jr., "Optimal Policy for a Multiproduct, Dynamic, Nonstationary Inventory Problem," Management Science 12, 206-222 (1965).
[22] Veinott, A., Jr., "The Status of Mathematical Inventory Theory," Management Science 12, 745-777 (1966).
[23] Zangwill, W., "A Deterministic Multiproduct, Multifacility, Production and Inventory Model," Operations Research 14, 486-507 (1966).
[24] Zangwill, W., "The Shortest Route Problem under Either Concave or Convex Costs," presented at the 12th Annual Operations Research Society of America Meeting, Santa Monica, California (1966).
[25] Zangwill, W., Nonlinear Programming: A Unified Approach (Prentice-Hall, Inc., Englewood Cliffs, N.J., 1969).
CONCAVE MINIMIZATION OVER A CONVEX POLYHEDRON
Hamdy A. Taha
University of Arkansas
ABSTRACT
A general algorithm is developed for minimizing a well-defined concave function over a convex polyhedron. The algorithm is basically a branch-and-bound technique which utilizes a special cutting plane procedure to identify the global minimum extreme point of the convex polyhedron. The indicated cutting plane method is based on Glover's general theory for constructing legitimate cuts to identify certain points in a given convex polyhedron. It is shown that the crux of the algorithm is the development of a linear underestimator for the constrained concave objective function. Applications of the algorithm to the fixed-charge problem, the separable concave programming problem, the quadratic problem, and the 0-1 mixed integer problem are discussed. Computer results for the fixed-charge problem are also presented.
I. INTRODUCTION
Consider the problem

(1) min f(x),

where x = (x_1, x_2, . . ., x_n) and Q = {x ∈ E^n | Ax = b, x ≥ 0}. The function f(x) is assumed to be concave and well defined over the convex polyhedron Q. It is also assumed that the constrained minimum of f(x) is finite.
The optimum solution to (1) is characterized by its occurrence at an extreme point of Q. However, the principal difficulty is that a local minimum is not necessarily global.
A method for solving (1) was proposed by Hoang Tuy [12], but with the additional requirement that f(x) be concave over all x ∈ E^n. Tuy's algorithm is started by identifying a local minimum point, x̄, of Q. A hyperplane cut (called Tuy's cut) is then determined and augmented to the problem so that all feasible (extreme) points in Q having a worse value than f(x̄) are excluded. Informally, Tuy's cut is generally defined by a hyperplane passing through the end points of the extended half-lines emanating from the current local minimum such that the associated values of f(x) at these end points are equal to f(x̄). It is clear that f(x̄) is an upper bound on the optimal objective value and that any extreme point x ∈ Q having f(x) ≥ f(x̄) cannot be promising. The process is then continued by searching for a local minimum of the new solution space resulting from the application of the last Tuy cut. If no new local minima exist, the algorithm is terminated with the last local minimum being the global optimum. Tuy provides no convergence proof for the algorithm.
This paper presents a new algorithm for solving (1). The algorithm is basically a branch-and-bound method which utilizes a special cutting plane procedure to identify the global extreme point of Q. The main difference between this work and Tuy's is that the cuts are generated solely from the geometry of the convex polyhedron Q. Also, the identification of the candidate extreme points of Q necessitates defining a linear function which underestimates f(x). The linearity restriction is important since, as will be seen later, it reduces the problem to solving a series of linear programs.
Although a method is given for developing a linear underestimator for the general case, illustrations for developing more efficient (or tighter) underestimators are also provided for important concave minimization problems.
In section II, the generalized branch-and-bound algorithm is presented and its relationship to work by other authors is discussed. Section III introduces the cutting plane method associated with the algorithm. Section IV develops a general linear underestimator for f(x) and shows how "tighter" underestimators are developed for the fixed-charge problem, the separable programming problem, and the 0-1 mixed integer linear problem. Finally, section V illustrates the computational efficiency of the proposed algorithm as applied to the fixed-charge problem.
II. THE ALGORITHM
The general idea of the algorithm is explained as follows. Let l(x) be a linear underestimator of f(x) over Q; that is,

(2) l(x) ≤ f(x),  x ∈ Q;

then it is clear that

(2') min {l(x) | x ∈ Q} ≤ min {f(x) | x ∈ Q}.

This means that by starting with the extreme point x^0 satisfying min {l(x) | x ∈ Q}, f̲ = l(x^0) is a lower bound on the optimum objective value of (1), while, from (2), an obvious upper bound is given by f̄ = f(x^0). Now consider x^1 (≠ x^0), an adjacent extreme point to x^0 such that x^1 yields the smallest l(x) among all the adjacent extreme points of x^0. It is clear that only those adjacent points having f̲ ≤ l(x) ≤ f̄ need be considered in determining x^1. The point x^1 is then said to be the next ranked extreme point. (The exact details of the general (cutting plane) procedure for determining the next ranked extreme points will be presented in section III.) Now the new lower bound is f̲ = l(x^1). The upper bound f̄ is changed to f(x^1) only if f(x^1) is smaller than the current upper bound f̄ = f(x^0).
In general, suppose E^{i−1} = {x^0, x^1, . . ., x^{i−1}} is the set of (nonredundant) extreme points thus far ranked. Then x^i, the next ranked extreme point, is selected as the adjacent extreme point to one of the elements in E^{i−1} such that l(x^i) is again the smallest among all such adjacent extreme points, provided x^i is not in E^{i−1}, that is, x^i is nonredundant with respect to E^{i−1}. The current lower bound is now given by f̲ = l(x^i), but the upper bound is changed to f(x^i) only if this quantity is smaller than the best available upper bound f̄.
The termination of the procedure is effected at x^k if f̲ = l(x^k) ≥ f̄, with the extreme point associated with f̄ being the optimum. This follows since, from (2),

f(x) ≥ l(x^k) ≥ f̄,  x ∈ Q* − E^k,

where Q* is the set of extreme points of Q. This condition shows that all the remaining extreme points (Q* − E^k) can only yield worse objective values than f̄, and are thus nonpromising.
The above discussion can be summarized in algorithmic form as follows:
STEP 0: Solve the linear program

min {l(x) | x ∈ Q}

and let x^0 be the optimum extreme point. Define f̲ = l(x^0) as the lower bound on the optimum objective value of (1). Let x* = x^0. Then f(x*) is an upper bound. Set i = 0, then go to Step 1.
STEP 1: The current upper and lower bounds are given by f(x*) and l(x^i), respectively. Let x^{i+1} be the next ranked extreme point of Q and set the new lower bound f̲ = l(x^{i+1}). Go to Step 2.
STEP 2: If l(x^{i+1}) ≥ f(x*), stop; x* is optimum. If f(x^{i+1}) < f(x*), set x* = x^{i+1}, f(x*) = f(x^{i+1}). Otherwise, the upper bound remains unchanged. Set i = i + 1 and go to Step 1.
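Steps 0-2 reduce to a simple scan once the extreme points of Q are available in order of increasing l(x); a sketch (the ranking itself, produced in the paper by the cutting-plane method of section III, is assumed here to be passed in as an already-sorted iterable, and all names are illustrative):

```python
def concave_minimize(f, l, ranked_extreme_points):
    """Scan the extreme points of Q in order of increasing linear
    underestimator l(x); stop once the lower bound l(x) reaches the best
    concave objective value f(x) observed so far.

    ranked_extreme_points: iterable of extreme points sorted by l(x)."""
    best_x, best_f = None, float("inf")
    for x in ranked_extreme_points:
        if l(x) >= best_f:        # every later point can only be worse
            return best_x, best_f
        fx = f(x)
        if fx < best_f:           # new upper bound
            best_x, best_f = x, fx
    return best_x, best_f
```

For example, on the unit square with f(x) = −(x_1² + x_2²) and the valid linear underestimator l(x) = −2(x_1 + x_2), the scan stops after two points and returns the vertex (1, 1).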
The general idea of the above algorithm was first proposed by Katta Murty [7] for solving the fixed-charge problem. Murty also indicated [7, Corollary 1] that for f(x) = D(x) + z(x), where z(x) is linear and D(x) is concave, if l(x) is taken equal to z(x), the algorithm is equally applicable. However, it is clear that Murty's corollary is true only if z(x) ≤ z(x) + D(x). This obviously is not valid, in general. Later, Cabot and Francis [3] utilized the exact algorithm to solve the case where D(x) is a negative (semi)definite quadratic form. (See section IV.) The Cabot-Francis paper, however, presents the details of Murty's algorithm in a more explicit manner.
The ranking procedure of Step 1, as advanced by Murty, determines the adjacent extreme points to each element (basis) in E^i as the (new) basic solutions in which one of the current (eligible) nonbasic variables is made basic. This requires carrying out a single pivot operation as in the simplex method. The major drawback of Murty's procedure is that the number of generated adjacent extreme points may become very large, to the extent of taxing the computer memory. Moreover, because the same (adjacent) extreme point may be generated from more than one element in E^i, a procedure is needed to avoid storing redundant points. The extensive experimentation by this author shows that Murty's algorithm, as applied to the zero-one problem, generally yields very discouraging results (see [10]). The complex bookkeeping procedures required to economize the utilization of the computer memory and to minimize redundancy show distinctly that the algorithm can very easily reach an unmanageable state.
This paper differs from the work of Murty in two respects:
(i) It presents a general algorithm which solves any problem of type (1). This is in contrast with Murty's (or Cabot and Francis') work, which leaves the impression that it can handle specialized concave problems only.
(ii) It develops a new procedure for the details of Step 1 of the algorithm which improves on the drawback of Murty's ranking scheme. The new procedure utilizes a cutting plane technique which uses the "convexity cuts" recently developed by Glover [5].
It must be noted that, by using Murty's ranking procedure, the requirement that the underestimator l(x) be linear is needed only in Step 0. This follows since the theory of linear programming automatically allows the determination of the proper extreme point, x^0. Clearly, the linearity assumption is not needed in the ranking procedure of Step 1. This is in contrast with the new ranking procedure, to be introduced in the next section, where the linearity of the underestimator is a mandatory requirement. This follows since the ranked extreme points are determined by applying the dual simplex method of linear programming.
III. A CUTTING PLANE METHOD FOR RANKING THE EXTREME POINTS OF Q
Informally, the idea of the new ranking scheme is explained as follows. Start with x^0 obtained at Step 0. Then define a cut that eliminates x^0 only from among all the extreme points of Q. The hyperplane associated with the cut is determined so as to pass through the adjacent extreme points of x^0. Now, augmenting the linear programming problem with the cut and applying the dual simplex method, the resulting optimum feasible solution yields the next ranked extreme point. A new cut can now be generated from the adjacent extreme points by using the new solution space. The process is repeated as necessary.
The above procedure is tailored after a recent development by Glover [5], who lays out a general theory for constructing legitimate cuts which can be used systematically to determine certain points in a given convex polyhedron. A typical illustration is the convex polyhedron Q with its extreme points representing the set of points to be identified. Glover's theory actually generalizes earlier ideas by Young [13] and Balas [1], where they developed legitimate cuts for the integer linear programming problem.
To formalize the above discussion, let the current basic solution be defined by the set of equations

(3) y_i = b_{i0} − Σ_{j∈N} b_{ij} t_j,  i ∈ M,  y_i, t_j ≥ 0,

where y_i and t_j are the basic and nonbasic variables, respectively. The sets M and N define the indices of the basic and nonbasic variables. The cut referred to above is now described based on Glover's theory:
GLOVER'S CONVEXITY CUT LEMMA [5]: Let S be a set of points in the convex polyhedron Q. If R is a convex set whose interior contains no point in S, and if y_i = b_{i0}, i ∈ M (possibly a boundary point of R), has a deleted feasible neighborhood which lies in the interior of R, then for any constants t_j* > 0, j ∈ N, such that the point defined by

y_i = b_{i0} − b_{ij} t_j*,  i ∈ M,

lies in R for each j ∈ N, the convexity cut†

(4) Σ_{j∈N} t_j / t_j* ≥ 1

excludes the extreme point y_i = b_{i0}, i ∈ M, but never any point in S.
The application of the above lemma to Step 1 of the algorithm is straightforward. Here the set S consists of the unranked extreme points of the current solution space. The point y_i = b_i0, i∈M, takes the place of the current "ranked" extreme point, and the set R is represented by the convex polyhedron describing the current solution space.
The determination of the constants t_j* in (4) follows directly from the theory of the simplex method; that is,
† Because of the convexity requirement stipulated on the set R, Glover coins the suggestive name "convexity cut."
CONCAVE MINIMIZATION 537
t_j* = min_{i∈M} { b_i0 / b_ij : b_ij > 0 },

with t_j* = ∞ when b_ij ≤ 0 for all i∈M. Clearly, t_j* is strictly positive if the current solution is nondegenerate, that is, b_i0 > 0 for all i∈M. When b_i0 = 0 for at least one i∈M, then it is possible that t_j* = 0 and the convexity cut (4) becomes undefined.
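The ratio test is a one-liner in code. The sketch below computes t_j* for one nonbasic column of a tableau stored as the right-hand sides b_i0 and the column entries b_ij; it ignores upper-bounded variables, which (as in the example of section V) would contribute additional ratios.

```python
import math

def t_star(b0, col):
    """t_j* = min over i of b_i0 / b_ij, taken over entries with b_ij > 0;
    infinite when the column has no positive entry (unbounded edge)."""
    ratios = [bi0 / bij for bi0, bij in zip(b0, col) if bij > 0]
    return min(ratios) if ratios else math.inf

print(t_star([1.0, 2.0], [1.0, -1.0]))   # 1.0 (nondegenerate)
print(t_star([0.0, 2.0], [1.0, 2.0]))    # 0.0 -- the degenerate case
print(t_star([1.0, 2.0], [-1.0, 0.0]))   # inf
```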
In order to overcome the above difficulty resulting from a degenerate situation, we use the following procedure due to Balas [1].†
Degeneracy occurs when an extreme point is "overdetermined," that is, when the current solution point has more than n hyperplanes associated with it, where n is the total number of variables. Balas [1] proves that by dropping each constraint for which the associated basic variable is equal to zero, the resulting convex polytope necessarily associates n distinct edges with the current solution vertex. Under this condition, the values of t_j* are readily determined. Of course, when the cut is added, all the deleted constraints must be reactivated before the problem is reoptimized, unless such constraints are proved to be redundant with respect to R, in which case they can be dropped completely.
There are two important points which must be considered in association with the degeneracy problem. These difficulties do not arise in Balas' case mainly because the sets R and S in his problem remain unaffected by the deletion of the constraints associated with the zero basic variables. This obviously is not the case in our situation.
(i) Let C be the degenerate cone associated with the current solution vertex, X, and define L as the polytope obtained from the current solution space of the problem by deleting the halfspaces associated with C. Further, define C̄ as the nondegenerate cone associated with X which is obtained from C by deleting all the constraints satisfying Balas' condition.
Since C ⊂ C̄, the cut obtained from the adjacent extreme points resulting from the intersection of C̄ with L cannot be stronger than its equivalent when C replaces C̄. This means that the new cut cannot eliminate any of the extreme points of R (= C ∩ L) which have not been tested for optimality. Consequently, the cut obtained by using C̄ is legitimate with respect to R.
(ii) The cut obtained by using C̄ will most likely create new extreme points which are not part of the vertices of the original solution space. The question then arises as to the possibility of the optimal solution being "trapped" at one of these vertices. This point is refuted as follows:
By the convexity cut lemma, such extreme points (when they occur) must lie on the halfline

(5)    y_i = b_i0 - b_ij t_j,    0 ≤ t_j ≤ t_j*,    i∈M.

If (5) is an edge (or a segment thereof) of the original solution space Q (defined in (1)), then the new extreme point is actually a nonvertex of Q. Consequently, it cannot yield an improved solution point, as this leads to a contradiction. A similar argument applies if (5) is a new edge resulting from the application of previous cuts.
† It must be noted that Murty's procedure overcomes the degeneracy problem by enumerating all the basic solutions associated with the current extreme point. From the computational point of view, this has proved to be very time consuming (see [10]).
It is important to notice that the effect of degeneracy goes beyond simple inconvenience in computation. Essentially, the creation of new extreme points must reduce the efficiency of the proposed method since it may be necessary to test these points for optimality (see the numerical example in section V for an illustration). Consequently, serious consideration must be given to minimizing the effect of degeneracy. The work of Thompson, Tonge, and Zionts [11] provides ways for eliminating degeneracy in certain situations (as illustrated by the numerical example in section V). However, there does not yet exist a general method for handling all degeneracy situations.
IV. DETERMINATION OF THE LINEAR UNDERESTIMATOR l(x)
In this section we show how a linear underestimator l(x) can be developed for f(x) in the general case. Since the efficiency of the proposed algorithm should depend on the selection of the underestimator, illustrations are given showing how tighter underestimators can be developed for an important class of concave minimization problems. This includes the fixed-charge problem, the separable programming problem, the quadratic problem, and the 0-1 mixed integer problem.
(i) General Underestimator l(x):
From the properties of concave functions, a tangent hyperplane to f(x) at x̄ [assume x̄∈Q, where Q is the convex polyhedron defined in (1)] overestimates f(x). Consequently, it appears plausible that we can make use of a tangent hyperplane to g(x) = -f(x) (with modifications) to underestimate f(x).
Let t_g(x) be a tangent hyperplane to g(x) at a given point.† Clearly, for any x,

(6)    t_g(x) ≥ g(x).

Now a transition from g(x) to f(x) can be made if g(x) ≤ f(x). Unfortunately, this is not true in general. However, if the values of x are restricted to those in Q, then the transition can be achieved as follows:

PROPOSITION: Let M ≥ 0 be a real number; then there must exist a value of M < ∞ such that

(7)    -M + t_g(x) ≤ f(x),    x∈Q.

In this case l(x) = -M + t_g(x).
PROOF: We need only show that -M + g(x) ≤ f(x) for all x∈Q. The minimum (maximum) value of f(x) (g(x)) occurs at an extreme point of Q. If min_{x∈Q} f(x) ≥ 0, then max_{x∈Q} g(x) ≤ 0, and obviously the desired result is achieved for M = 0. Now, suppose min_{x∈Q} f(x) < 0; then max_{x∈Q} g(x) > 0. By assumption, f(x) possesses a finite minimum over Q. Thus M can be selected such that M ≥ -min_{x∈Q} f(x).
Since, by symmetry, -min_{x∈Q} f(x) = max_{x∈Q} g(x), it follows that

-M + g(x) ≤ -M + max_{x∈Q} g(x) ≤ 0 ≤ M + min_{x∈Q} f(x) ≤ M + f(x),    x∈Q.

Now, since M ≥ -min_{x∈Q} f(x) can be taken arbitrarily large, then letting M ≥ -2 min_{x∈Q} f(x), the desired conclusion follows immediately.
† We further assume that the tangent hyperplane is determined at x̄ satisfying ∇g(x̄) ≠ 0, where ∇g(x̄) is the gradient vector of g(x) at x̄. This will ensure that the resulting linear underestimator is not trivial.
The above proposition actually implies that a linear underestimator for f(x), x∈Q, can be taken as

l(x) = Σ_{j∈N} m_j x_j - M,

where the m_j are positive constants and M ≥ |min_{x∈Q} f(x)|. If min_{x∈Q} f(x) ≥ 0, then M can be taken equal to zero. Notice that since the lower bound on M is obviously not known a priori, one must rely on some practical estimate to determine a numerical value for M. Although any values of m_j > 0 can be utilized, further research is needed to determine the best set of values providing the tightest linear underestimator.
(ii) Fixed-Charge Problem:
In the fixed-charge problem, f(x) is defined by

f(x) = Σ_{j∈N} c_j x_j + Σ_{j∈N} K_j δ(x_j),    N = {1, . . ., n},

where δ(x_j) = 0 if x_j = 0, and δ(x_j) = 1 if x_j > 0. The coefficients c_j and K_j are real numbers with K_j > 0 for all j. It can be proved that f(x) is a concave function which is continuous everywhere except at x = 0. Now, since K_j > 0 by assumption, it follows that

Σ_{j∈N} c_j x_j ≤ Σ_{j∈N} c_j x_j + Σ_{j∈N} K_j δ(x_j),    x_j ≥ 0.

This shows that the linear underestimator can be taken as

(8)    l(x) = Σ_{j∈N} c_j x_j.

Notice l(x) is valid for any x_j ≥ 0.
The application of the above estimator will be illustrated numerically in the next section. Notice that the same idea can be utilized to solve certain problems that often arise in inventory theory. A typical example is the finite horizon, multiple-item model in which price breaks (or quantity discounts) are allowed in the ordering function. This typically results in a piecewise-linear concave function. In this case, the linear segments representing the smallest per unit ordering cost can be used to determine l(x).
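The inequality behind (8) is easy to check numerically. The sketch below evaluates a small hypothetical fixed-charge objective and its linear part; the two coincide exactly when every x_j = 0, and the linear part lies below f everywhere else because the K_j are positive.

```python
def fixed_charge_f(x, c, K):
    """f(x) = sum c_j x_j + sum K_j delta(x_j), with delta(x_j) = 1 for x_j > 0."""
    linear = sum(cj * xj for cj, xj in zip(c, x))
    charges = sum(Kj for Kj, xj in zip(K, x) if xj > 0)
    return linear + charges

def l_fixed_charge(x, c):
    """Underestimator (8): drop the fixed charges (valid since K_j > 0)."""
    return sum(cj * xj for cj, xj in zip(c, x))

c, K = [3.0, -2.0], [5.0, 4.0]            # hypothetical coefficients, K_j > 0
for x in [(0.0, 0.0), (1.0, 0.0), (2.0, 3.0)]:
    assert l_fixed_charge(x, c) <= fixed_charge_f(x, c, K)
```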
(iii) Separable Programming Problem:
Let

f(x) = Σ_{j∈N} f_j(x_j),

where f_j(x_j) is a concave and well-defined function. Suppose now that the feasible range for each x_j as defined by the solution space Q is given by a_j ≤ x_j ≤ b_j, where a_j and b_j are known constants. Let l_j(x_j) = α_j x_j + β_j be the straight line joining the two points (a_j, f_j(a_j)) and (b_j, f_j(b_j)). Since f_j(x_j) is concave, it follows by definition that

l_j(x_j) ≤ f_j(x_j),    x∈Q.

Consequently,

(9)    l(x) = Σ_{j∈N} l_j(x_j).
It is noted that the fixed-charge problem discussed in (ii) satisfies the condition for a separable concave problem. Consequently, the above linear underestimator can also be used with the fixed-charge problem. Notice, however, that the present underestimator is tighter than that defined in (ii). This follows since it is defined for constrained values of x only, as compared with x ≥ 0 in the fixed-charge problem.
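For a concave f_j on [a_j, b_j], the chord through the endpoint values underestimates f_j everywhere on the interval, which is all that (9) uses. A minimal sketch (the sample function is illustrative only):

```python
def chord(fj, a, b):
    """Straight line l_j(x) = alpha*x + beta through (a, fj(a)) and (b, fj(b))."""
    alpha = (fj(b) - fj(a)) / (b - a)
    beta = fj(a) - alpha * a
    return lambda x: alpha * x + beta

fj = lambda x: -(x - 1.0) ** 2            # a concave function on [0, 3]
lj = chord(fj, 0.0, 3.0)

# The chord matches f_j at the endpoints and lies below it in between.
assert abs(lj(0.0) - fj(0.0)) < 1e-12 and abs(lj(3.0) - fj(3.0)) < 1e-12
assert all(lj(x) <= fj(x) + 1e-12 for x in [0.5, 1.0, 1.5, 2.0, 2.5])
```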
(iv) Quadratic Minimization Problem:
Let

f(x) = z(x) + D(x),

where z(x) is linear and D(x) is a negative (semi)definite quadratic form. If D(x) = xBxᵀ, then B is negative (semi)definite. The linear underestimator in this case was developed by Gilmore [4] and by Lawler [6] and was subsequently utilized by Cabot and Francis [3] in connection with Murty's algorithm.
Let b_j be the jth column of B. Then D(x) can be written as

D(x) = Σ_{j∈N} (x b_j) x_j.

Define u_j = min_{x∈Q} {x b_j}. Hence

(10)    l(x) = z(x) + Σ_{j∈N} u_j x_j ≤ z(x) + D(x).

Notice that since x_j ≥ 0 for all j, then, for any finite w_j ≤ u_j, z(x) + Σ_{j∈N} w_j x_j still provides a legitimate underestimator. This result can sometimes be used advantageously to avoid solving n linear programs. An application of this situation is given below.
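Computing each u_j in (10) requires a linear program over Q in general. When Q is relaxed to a bounding box, the minimum of x·b_j is available in closed form, and the resulting w_j ≤ u_j still gives a legitimate underestimator; this box relaxation is an assumption of the sketch below, not part of the paper's method.

```python
def column_min_over_box(bj, lo, hi):
    """min of x . b_j over the box lo <= x <= hi: take the upper bound where
    the coefficient is negative and the lower bound where it is nonnegative."""
    return sum((h if bk < 0 else l) * bk for bk, l, h in zip(bj, lo, hi))

# B negative semidefinite (diagonal, nonpositive entries); columns b_1, b_2.
B = [[-2.0, 0.0], [0.0, -1.0]]
lo, hi = [0.0, 0.0], [1.0, 1.0]
u = [column_min_over_box([B[0][j], B[1][j]], lo, hi) for j in range(2)]

def D(x):
    """Quadratic form x B x^T."""
    return sum(x[i] * B[i][j] * x[j] for i in range(2) for j in range(2))

x = (0.5, 0.5)
assert sum(uj * xj for uj, xj in zip(u, x)) <= D(x)   # underestimation on the box
```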
(v) Zero-One Mixed Integer Problem:
In this problem

f(x) = Σ_{j∈N} c_j x_j,    x_j = (0, 1) for j∈N¹ ⊂ N.

The function f(x) can be written equivalently as

f(x) = Σ_{j∈N-N¹} c_j x_j + Σ_{j∈N¹} (c_j + M(1 - x_j)) x_j,    M > 0 and very large,

where x_j ≥ 0, j∈N, and x_j ≤ 1, j∈N¹. The expression M(1 - x_j) assigns a very high penalty to x_j for 0 < x_j < 1, j∈N¹, thus allowing it to take binary values only. The mixed integer objective function has been equivalently converted into a quadratic function in which the quadratic form M Σ_{j∈N¹} (-x_j²) is clearly negative definite. Notice that all the variables in the new form are continuous.
The above equivalence relationship was developed by Raghavachari [8] and independently by Taha [10] in an effort to secure a simpler formulation for the mixed 0-1 integer problem.
The transformed f(x) is exactly in the same form as the function in (iv). Thus, using the method in (iv), u_j, j∈N¹, is defined by

(11)    u_j = min {-M x_j | x∈Q, 0 ≤ x_j ≤ 1}
            ≥ min {-M x_j | 0 ≤ x_j ≤ 1} = -M.

Thus, taking u_j = -M, it follows from the development in (iv) that l(x) = Σ_{j∈N} c_j x_j. This shows that l(x) can be taken as f(x) after removing the condition x_j = (0, 1), j∈N¹. Notice that if u_j is determined from the exact linear program in (11), the main difference would be that x_j < 1 over Q for some j∈N¹, that is, x_j = 0. This means that the new l(x) will be the same as above except that the indicated x_j are set equal to zero.
The above result can also be derived on an intuitive basis. Since for the 0-1 mixed integer problem the optimum must occur at an extreme point, the integrality condition can be replaced by the continuous range 0 ≤ x_j ≤ 1. It then follows that min {Σ c_j x_j | 0 ≤ x_j ≤ 1, j∈N¹} must underestimate min {Σ c_j x_j | x_j = (0, 1), j∈N¹}, since the former is less restrictive. This, incidentally, means that the transformation of f(x) given above does not yield any privileged information and hence is trivial.
Notice that by using the continuous range 0 ≤ x_j ≤ 1, j∈N¹, the resulting objective function becomes linear in x_j over its feasible values. Thus the new objective function may be considered concave over the feasible space, and the general algorithm in section II becomes applicable. In this case the upper bound f̄ is defined equal to ∞ for any extreme point not satisfying x_j = (0, 1), j∈N¹. The important point, however, is that the cut as defined in section III is uniformly weaker than its equivalent as developed by Balas [1]. On the other hand, the determination and use of Balas' stronger cut requires more complex computation as compared with ours. Consequently, the real merit of either cut can only be checked through computational experimentation.
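The penalty transformation in (v) can be spot-checked: at binary points the term M(1 - x_j)x_j vanishes, so the quadratic form reproduces the linear objective, while fractional points are pushed up by M. The coefficients below are hypothetical, and for simplicity every variable is taken to be in N¹.

```python
M = 1000.0
c = [3.0, -2.0, 5.0]

def f_penalized(x):
    """sum c_j x_j + sum M (1 - x_j) x_j: equals the linear objective at
    binary x, and adds a large positive penalty at fractional x."""
    return sum(cj * xj for cj, xj in zip(c, x)) + sum(M * (1 - xj) * xj for xj in x)

assert f_penalized([1.0, 0.0, 1.0]) == 8.0          # binary point: no penalty
assert f_penalized([0.5, 0.0, 0.0]) == 1.5 + 250.0  # fractional point: penalized
```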
V. COMPUTATIONAL EXPERIENCE WITH THE FIXED-CHARGE PROBLEM
This section illustrates the efficiency of the proposed algorithm by applying it to the fixed-charge problem. This special case is selected primarily because of its practical interest. In addition, the availability in the literature of computational results for other fixed-charge methods allows a more meaningful evaluation of the proposed algorithm.
In order to clarify the details of the algorithm, especially those associated with the degeneracy problem, we first introduce a numerical example. This will be followed by a presentation of the computer results as applied to randomly generated problems.
EXAMPLE:

minimize f(x) = φ₁(x₁) + φ₂(x₂)

subject to

2x₁ + x₂ + S₁ = 4
x₁ + x₂ + S₂ = 3
0 ≤ x₂ ≤ 5/2
x₁, S₁, S₂ ≥ 0,

where

φ₁(x₁) = 0 if x₁ = 0,    φ₁(x₁) = -4x₁ + 1 if x₁ > 0,
φ₂(x₂) = 0 if x₂ = 0,    φ₂(x₂) = -3x₂ + 1/2 if x₂ > 0.

Thus, l(x) = -4x₁ - 3x₂. Using Dantzig's technique to accommodate the upper bound on x₂, Table I gives the solution specifying x°. A graphical display of the solution is given in Figure 1.
Table I

Basic    Value    S₁      S₂
l        -10      -1      -2
x₁         1       1      -1
x₂         2      -1       2

x° = (1, 2); Point (0)
l(x°) = -10
f(x°) = -10 + (1 + 1/2) = -8 1/2
f(x*) = f(x°) = -8 1/2
Cut #1 is now developed. (Notice that the determination of the constants t_j* of the cut must be based on Dantzig's upper bounding technique.) Thus,

t*_{S₁} = min {1/1, (1/2)/1, ∞} = 1/2,
t*_{S₂} = min {2/2, ∞, ∞} = 1,

and the cut is given by

S₁/(1/2) + S₂/1 ≥ 1.
FIGURE 1. Solution of the numerical example.
Expressed in terms of x₁ and x₂, the cut is

(Cut #1)    5x₁ + 3x₂ ≤ 10.

We denote the slack of this constraint by S₃.
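As a check, Cut #1 does exactly what the convexity cut lemma promises on this example: of the five vertices of Q, it cuts off x° = (1, 2) only, while the adjacent vertices (1/2, 5/2) and (2, 0) lie on the cut itself.

```python
# Vertices of Q for the example: 2x1 + x2 <= 4, x1 + x2 <= 3,
# 0 <= x2 <= 5/2, x1 >= 0.
vertices = [(0.0, 0.0), (2.0, 0.0), (1.0, 2.0), (0.5, 2.5), (0.0, 2.5)]

def satisfies_cut(v):
    x1, x2 = v
    return 5 * x1 + 3 * x2 <= 10

removed = [v for v in vertices if not satisfies_cut(v)]
print(removed)   # [(1.0, 2.0)] -- only x0 violates the cut
```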
Table II yields x¹ as a result of augmenting Table I by Cut #1 and reoptimizing using the dual simplex method for upper-bounded variables.
Table II

Basic    Value     S₂      S₃
l        -9 1/2   -3/2    -1/2
x₁         1/2    -3/2     1/2
x₂         5/2     5/2    -1/2
S₁         1/2     1/2    -1/2

x¹ = (1/2, 5/2); Point (1)
l(x¹) = -9 1/2
f̄(x¹) = -9 1/2 + (1 + 1/2) = -8 > f(x*)
f(x*) = f(x°)
Notice that in Table II, x₂ is basic at its upper bound. This means that the current solution is degenerate. Using Balas' condition, which in this case calls for ignoring the equations involving basic variables at upper bound or zero level, it is clear that the x₂-equation must be disregarded in developing Cut #2. Thus,

t*_{S₂} = min {(1/2)/(1/2), ∞, ∞} = 1,
t*_{S₃} = min {(1/2)/(1/2), ∞, ∞} = 1.
This yields a new cut which, when expressed in terms of x₁ and x₂, is given by

(Cut #2)    6x₁ + 4x₂ + S₄ = 12,    S₄ ≥ 0.

Table III gives the new solution after Cut #2 is effected. Notice that x₂ is replaced by 5/2 - x₂′. Notice also that since S₃ is associated with a previous cut and since it is basic, its corresponding equation can be dropped in future tableaus.
Table III

Basic    Value     x₂′     S₄
l        -8 5/6   -1/3    -2/3
x₁         1/3    -2/3     1/6
S₂         1/6    -1/3    -1/6
S₁         5/6     1/3    -1/3
S₃         5/6     1/3    -5/6

x² = (1/3, 5/2); Point (2)
l(x²) = -8 5/6
f̄(x²) = -8 5/6 + 1 1/2 = -7 1/3 > f(x*)
f(x*) = f(x°)
Cut #3 is now generated from Table III. This gives

(Cut #3)    30x₁ + 24x₂ ≤ 60.

The application of this cut will yield point (3) with x³ = (2, 0) and l(x³) = -8. Since l(x³) > f(x*), the process terminates. Thus x* = x° is the optimum solution.
Notice the effect of degeneracy at point (1). Point (1) is (over)determined by the three lines x₂ = 5/2, x₁ + x₂ = 3, and 5x₁ + 3x₂ = 10. Balas' condition drops x₂ = 5/2. The cone C̄, as introduced in section III, is then defined by the halfplanes x₁ + x₂ ≤ 3 and 5x₁ + 3x₂ ≤ 10, which yields Cut #2. The optimum point (2) is a new extreme point which does not belong to the original solution space.
It is remarked that if the redundant constraint x₁ + x₂ ≤ 3 is eliminated instead of x₂ ≤ 5/2, then Cut #2 would have been stronger, as it would pass through points (3) and (4). Stanley Zionts, in a private communication to the author, shows that by using the results in [11], this specific degeneracy situation can be avoided. The idea is as follows: Prior to constructing a cut constraint, if there is any degeneracy, write the degenerate constraint so that the right-hand side element is zero. (In [11], methods for identifying redundant (and of course redundant degenerate) constraints are provided.) Applying this to Table II, x₂ is replaced by 5/2 - x₂′. In order for the redundant constraint to be implied in "definitional" form, x₂′ must now be made nonbasic with S₂ being the new basic variable. This yields Table IV (Table II revised).
Table IV

Basic    Value     x₂′     S₃
l        -9 1/2   -3/5    -4/5
x₁         1/2    -3/5     1/5
S₂         0      -2/5    -1/5
S₁         1/2     1/5    -2/5
Notice that the S₂-row is redundant now and may be dropped from the tableau. But more importantly, the generated cut is

x₂′/(5/2) + S₃/(5/2) ≥ 1,

which now passes through points (3) and (4), thus bypassing the extra point (2) and its associated cut.
Computer Results
The testing of the algorithm as applied to the fixed-charge problem is designed to check the effect of the size of the problem and the magnitude of the fixed charge on the speed of computation. Random problems of the type

min { Σ_j (c_j x_j + K_j δ(x_j)) | Σ_j a_ij x_j ≥ b_i, x_j ≥ 0, i = 1, . . ., m }

are generated with their coefficients lying in the ranges

0 ≤ c_j ≤ 999
0 ≤ K_j ≤ 160
20 ≤ a_ij ≤ 100
0 ≤ b_i ≤ 200.

The sizes of the generated problems are given by (m × n) = (5 × 20), (5 × 30), (10 × 20), and (15 × 30). In order to test the effect of the fixed charge, the same problems are used again with K_j replaced by 2K_j and 3K_j, respectively. No special structure is specified for the problems, and the density of the matrix, d = |{a_ij ≠ 0}|, is at least 97 percent.
The algorithm is coded in FORTRAN IV for the IBM 360/50. The results are summarized in Table V.
One of the basic difficulties we encountered in coding the algorithm was the control of machine round-off error. This is important since a zero variable may be rounded to a positive value, thus affecting
TABLE V. Summary of Computation
(Time in seconds)

Problem   (m×n) = (5×20)          (m×n) = (5×30)          (m×n) = (10×20)             (m×n) = (15×30)
number    Kj     2Kj    3Kj      Kj     2Kj    3Kj       Kj      2Kj      3Kj        Kj      2Kj      3Kj
1         0.300  0.500  10.70    0.484  0.513  0.683     64.917  65.117   150.317    40.600  44.150   702.500
2         0.216  0.250  0.250    0.366  0.467  23.650    22.833  22.950   127.950    9.184   11.400   11.167
3         0.250  0.283  0.450    0.467  0.483  0.483     1.417   192.117  192.117    59.000  66.183   72.350
4         0.216  0.717  1.717    1.530  2.633  3.116     70.650  77.117   81.980     3.650   417.183  702.500
5         0.317  0.317  2.783    0.866  0.750  1.550     1.550   2.000    33.967     19.984  135.433  803.117
Average   0.260  0.413  3.180    0.742  0.969  5.896     32.273  71.86    97.266     26.482  134.868  327.552
the bounds directly. The problem was overcome by using double-precision computation as well as appropriate tolerances. Also, checks were implemented in the code to detect the accumulation of machine round-off error. For example, an important check is to test whether at a given iteration the number of positive variables among the original variables exceeds the number of original constraints. It must be remarked that the five problems in Table V were selected from among 20 test problems as the ones yielding the least amount of "disorder" from the viewpoint of machine round-off error. The remaining problems were excluded by the checks in the code because they indicated uncontrollable round-off error. It is felt, however, that a professional programmer should be able to develop a more efficient and accurate code than the one written by the author.
Although the results in Table V are generally compatible with what one may expect, that is, the average computation time increases with the increase in the fixed charges, the individual problems exhibit peculiar behavior which needs explanation. For example, problem 3 at size (10 × 20) requires 1.5 seconds for Kj, 192 seconds for 2Kj, and again 192 seconds for 3Kj. This result can be justified as follows:
The termination of the algorithm occurs at the extreme point x' when l(x') ≥ f(x*). It is obvious that the computation time of the problem is primarily a function of the number of extreme points which are ranked before termination occurs. Thus, two problems having the same solution space will require the same computation time if they terminate at the same x'. Notice that l(x) depends only on the linear terms of the objective function and that its value at an extreme point does not depend on the fixed charges, while f(x*) is directly dependent on the fixed charges. Consequently, if l(x') - f(x*) for 2Kj is large enough to accommodate an increase in the fixed charges to 3Kj, termination still occurs at x' and the same computation time is consumed. Similarly, if l(x') - f(x*) for Kj is too small, an increase in the fixed charges to 2Kj may necessitate further ranking of new extreme points before termination is effected.
The results in Table V also show that the computation time increases more appreciably with the increase in the number of constraints than with the number of variables. These results differ from those associated with cutting plane algorithms in integer programming, where the number of variables is the main factor affecting the computation time. The reason for this appears to be that our algorithm depends more directly on the number of extreme points of the solution space, which is a function of both the number of constraints and the number of variables.
For the sake of comparing our algorithm with other exact methods for the fixed-charge problem, we came across only two algorithms, by Bod [2] and Steinberg [9]. The two methods are of the branch-and-bound type. Bod's method utilizes what may be termed a partial enumeration technique for testing all the extreme points (basic feasible solutions) of the convex polyhedron. The effective use of bounds on the objective value excludes most of the nonpromising extreme points. Steinberg's method, on the other hand, initiates two problems at each node according to whether the variable x_j is zero or positive. Bounds on the objective value are also used to effect the proper termination of the algorithm.
Bod does not present computer results for his algorithm. But Steinberg tests two sets of problems with sizes (5 × 10) and (15 × 30) on the IBM 360/50. The average computation times per problem for the two sets are 10 sec and 21.1 min, respectively. This is far inferior to the average computation time obtained by our algorithm, especially since Steinberg's algorithm can easily tax the computer memory. He reports that a set of 15 problems, with size (5 × 10) each, requires an average of 32 nodes, while those with size (15 × 30) each require an average of 1,208 nodes. This shows that the number of nodes can become very large even for problems with modest sizes. The problem is not present in our algorithm since, as in any cutting plane algorithm, the size of the matrix A at any iteration cannot exceed (m + n) × n.
We must remark also that, contrary to our algorithm, Steinberg's algorithm becomes slower as the magnitude of the fixed charge decreases. He utilizes the ranges 0 ≤ c_j ≤ 20 and 0 ≤ K_j ≤ 999 for his test problems, but does not study the effect of variations in K_j on the speed of computation.
VI. CONCLUSIONS
The algorithm presented in this paper is general in the sense that it can handle any concave minimization problem over a convex polyhedron. If the computer results of the algorithm as applied to the fixed-charge problem are at all indicative of its efficiency, it would appear that the algorithm can actually be used to solve practical problems. Further research is still needed, however, to develop the tightest linear underestimator for f(x). Also, since degeneracy is a pronounced problem in our algorithm, a general method is needed for treating the degenerate case without weakening the resulting cuts. This should improve the efficiency of computation considerably.
VII. ACKNOWLEDGMENT
The author wishes to thank Professor Stanley Zionts, State University of New York at Buffalo, for
his helpful comments.
REFERENCES
[1] Balas, E., "Intersection Cuts — A New Type of Cutting Plane for Integer Programming," Operations Research 19, 19-39 (1971).
[2] Bod, P., "Solution of a Fixed Charge Linear Programming Problem," Proceedings of Princeton Symposium on Mathematical Programming (Princeton University Press, Princeton, New Jersey, 1970), pp. 367-375.
[3] Cabot, A. V. and R. L. Francis, "Solving Certain Nonconvex Quadratic Minimization Problems by Ranking the Extreme Points," Operations Research 18, 82-86 (1970).
[4] Gilmore, P. C., "Optimal and Suboptimal Algorithms for the Quadratic Assignment Problem," SIAM Journal 10, 305-313 (1962).
[5] Glover, F., "Convexity Cuts and Cut Search," Operations Research 21, 123-134 (1973).
[6] Lawler, E. L., "The Quadratic Assignment Problem," Management Science 9, 586-599 (1963).
[7] Murty, K. G., "Solving the Fixed Charge Problem by Ranking the Extreme Points," Operations Research 16, 268-279 (1968).
[8] Raghavachari, M., "On the Zero-One Integer Programming Problem," Operations Research 17, 680-684 (1969).
[9] Steinberg, D. I., "The Fixed Charge Problem," Nav. Res. Log. Quart. 17, 217-235 (1970).
[10] Taha, H., "On the Solution of Zero-One Linear Programs by Ranking the Extreme Points," Technical Rept. No. 712, University of Arkansas (Feb. 1971), revised May 1972.
[11] Thompson, G. L., F. Tonge, and S. Zionts, "Techniques for Removing Nonbinding Constraints and Extraneous Variables from Linear Programming Problems," Management Science 12, 588-608 (1966).
[12] Tuy, H., "Concave Programming Under Linear Constraints," Soviet Math. 5, 1437-1440 (1964).
[13] Young, R. D., "New Cuts for a Special Class of 0-1 Integer Programs," Research Report, Rice University, Texas (Nov. 1968).
ESTIMATION OF A HIDDEN SERVICE DISTRIBUTION OF AN M/G/∞ SYSTEM*

Laurence Lee George
University of Louisville
Louisville, Kentucky

and

Avinash C. Agrawal
University of British Columbia
Vancouver, B.C., Canada
ABSTRACT
The maximum likelihood estimator of the service distribution function of an M/G/∞ service system is obtained based on output time observations. This estimator is useful when observation of the service time of each customer could introduce bias or may be impossible. The maximum likelihood estimator is compared to the estimator proposed by Mark Brown [2]. Relative to each other, Brown's estimator is useful in light traffic while the maximum likelihood estimator is applicable in heavy traffic. Both estimators are compared to the empirical distribution function based on a sample of service times and are found to have drawbacks, although each estimator may have applications in special circumstances.
1. INTRODUCTION
Suppose customers arrive at a service system at instants T₁, T₂, . . ., Tₙ, where {Tₙ} is a stationary Poisson process with rate parameter λ customers per unit time. Each customer is served upon arrival and there are sufficient servers. Service times are independently and identically distributed with some unknown distribution function G(t), t ≥ 0. These conditions describe the M/G/∞ service system. They are often found in self-service systems. In the design of such systems it may be necessary to determine the unknown service distribution. Direct observations on the service time for each customer that enters the system may not be possible because of economic constraints or because of other factors, such as the introduction of unavoidable bias, or simply because the actual behaviour of the customers while in the system is unobservable. An example of the first case may be cars entering a freeway, where the distribution function of the time spent by cars on the freeway is to be estimated and tracing each car individually to find the time spent on the freeway may be extremely expensive. A similar situation may also exist in any store where it may not be possible to follow each customer through the store. Another effect of making direct observations on service time is to bias observations, as customers may become conscious of being observed. It is for these reasons that direct observations on service time may not be possible. The service distribution, therefore, is hidden, and estimation must be based on information other than a sample of service times.
*This research was supported in part by the Defence Research Board of Canada, Grant Number 9701 25, when the authors were at the University of British Columbia.
550 L. L. GEORGE AND A. C. AGRAWAL
2. MAXIMUM LIKELIHOOD ESTIMATOR OF THE HIDDEN SERVICE DISTRIBUTION WHEN λ IS KNOWN AND OBSERVATIONS ON OUTPUT TIMES ARE AVAILABLE
Mirasol [5] shows that the output of an M/G/∞ service system is a nonstationary Poisson process with

(2.1)    Pr (number of departures in (0, t) = n | system initially empty)
         = e^(-λ∫₀ᵗ G(x)dx) (λ∫₀ᵗ G(x)dx)ⁿ / n!,    n = 0, 1, . . .,

where G(·) is the common service time distribution function and λ is the Poisson arrival rate. The intensity function of this time-dependent process, λG(t), is both nonnegative and nondecreasing and is bounded above by λ, the Poisson arrival rate.
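Concretely, the mean of the departure count in (0, t) is λ∫₀ᵗ G(x)dx. For exponential service, G(x) = 1 - e^(-μx), the integral has the closed form t - (1 - e^(-μt))/μ, which a quick numerical check confirms (the values of λ, μ, and t below are arbitrary):

```python
import math

def mean_departures_numeric(lam, mu, t, n=20000):
    """lam * integral of G over (0, t) by the midpoint rule, G exponential."""
    h = t / n
    return lam * sum((1 - math.exp(-mu * (i + 0.5) * h)) * h for i in range(n))

def mean_departures_closed(lam, mu, t):
    """lam * (t - (1 - e^(-mu t)) / mu)."""
    return lam * (t - (1 - math.exp(-mu * t)) / mu)

assert abs(mean_departures_numeric(2.0, 1.0, 3.0) -
           mean_departures_closed(2.0, 1.0, 3.0)) < 1e-5
```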
The likelihood function for a nonstationary Poisson process with t₁, t₂, . . ., tₙ as the times of occurrence of events is given by the joint density function

(2.2)    f(t₁, t₂, . . ., tₙ; λ(t)) = Pr (observing events at t₁, t₂, . . ., tₙ; λ(t))
         = [∏ᵢ λ(tᵢ)] exp (-∫₀^(tₙ) λ(x)dx),
where λ(t) is the intensity function of the Poisson events. The first step in the problem under study involves finding a function λ(t), t ≥ 0, which maximizes the likelihood function given by Equation (2.2) for fixed t₁, t₂, . . ., tₙ under the condition that λ(t), t ≥ 0, is nonnegative and nondecreasing. The maximum likelihood estimate of λ(t), t ≥ 0, satisfying these conditions has been obtained by Boswell [1] as

(2.3)    λ̂(t) = 0                      if 0 ≤ t < t₁,
         λ̂(t) = min {M, λ̂(t_k)}        if t_k ≤ t < t_(k+1),  k = 1, 2, . . ., (n - 1),
         λ̂(t) = M < ∞                  if t ≥ tₙ,

where

(2.4)    λ̂(t_k) = max_{0≤α≤k-1} min_{k≤β≤n-1} (β - α) / (a_α + a_(α+1) + . . . + a_β)

and

a_k = t_(k+1) - t_k,    k = α, α + 1, . . ., β.
It may be noted that in the absence of an upper bound M on the value of λ(t), the solution obtained will carry no meaning, as (2.2) can be made arbitrarily large by setting λ(t) = ε > 0 for t < tₙ and setting λ(tₙ) arbitrarily large. Therefore, let λ(t) ≤ M for some fixed positive number M.
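The max-min estimator can be coded directly. The exact index conventions in Boswell's formula are easy to get wrong, so the sketch below fixes one plausible convention ((β - α) events over the elapsed time t_β - t_α) and caps the estimate at M; whatever the exact convention, the max-min structure guarantees that the resulting rates are nondecreasing.

```python
def boswell_rates(ts, M):
    """Capped max-min intensity estimates, one per interval [t_k, t_{k+1}).
    Index convention is an assumption: candidates are (b - a) / (ts[b] - ts[a])
    for a < k <= b."""
    n = len(ts)
    rates = []
    for k in range(1, n):
        cand = max(
            min((b - a) / (ts[b] - ts[a]) for b in range(k, n))
            for a in range(k)
        )
        rates.append(min(cand, M))
    return rates

r = boswell_rates([1.0, 2.0, 3.0, 3.5, 3.8, 4.0], M=10.0)
assert all(x <= y for x, y in zip(r, r[1:]))   # nondecreasing, as required
assert all(x <= 10.0 for x in r)               # never exceeds the cap M
```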
The maximum likelihood estimate λ̂(t) for the function λ(t), t ≥ 0, may be used to estimate λG(t), t ≥ 0, from output observations of an M/G/∞ system during some interval [0, T], T > t. This will give an estimate of λ·G(t) for t∈[0, T]. To obtain estimates of λ·G(t) for small t, the output process should be observed for small t. For large values of t, the output becomes a stationary Poisson process at rate λ, and G(t) is estimated as 1. If the input rate λ is assumed to be known (it may be estimated from input data) and it is also assumed that the system starts empty, then the maximum likelihood estimate of the service distribution function G(t) is given by:

(2.5)    Ĝ(t) = min [λ̂(t)/λ, 1].

In case it is desired to relieve the assumption that the system starts empty at t = 0, one must consider the first outputs as order statistics from G(t), given the number in the system at t = 0, possibly mixed with outputs which arrive after t = 0.
The hidden service distribution G(t) for an M/G/∞ system may also be estimated by peeking at
the system only at times t₁, t₂, …, tₙ and observing the number in the system N(t₁), …, N(tₙ).
This sequence may also be used for maximum likelihood estimation because the number of customers
in the M/G/∞ service system is also a nonstationary Poisson process with intensity function λ(1 − G(t)),
nonincreasing in t. The maximum likelihood estimator λ̂(t) from (2.4) may be made into a maximum
likelihood estimator of a nonincreasing function by reversing the max-min operation in (2.4). From this
estimate of λ(1 − G(t)), t ≥ 0, one can obtain an estimate of G(t), t ≥ 0.

Simulated output times of M/D/∞ and M/M/∞ service systems have been used in calculating the
maximum likelihood estimators. Comparison of these estimates to the true service distribution and to
another estimator is made in section 4.
3. PROPERTIES OF MAXIMUM LIKELIHOOD ESTIMATOR FOR G(t)
The maximum likelihood estimator of λG(t) is a step function with jumps at the output times
T₁ ≤ T₂ ≤ … ≤ Tₙ. The first nonzero value of the estimate of G(t) occurs at or after T₁, giving no information about G(t) for t < T₁. This limitation may be removed by taking observations on N repeated
runs of the service system starting empty. The ordered output times for all runs are used in calculating the estimator of G(t). The maximum likelihood property of this estimator still holds, and the
estimator Ĝ_N(t) is given by

(3.3) Ĝ_N(t) = min[λ̂(t)/(Nλ), 1],

where N is the number of runs.
A lower bound on the expected time of the first output from N runs is the expected first input time
in N runs, 1/(λN). Extreme value theory suggests that asymptotically the time of the first observation
on G(t) will become smaller as the number of runs increases. Let D_{ij} be the departure time of
the ith customer in the jth run, where i = 1, 2, …, n, j = 1, 2, …, N. D_{ij} = T_{ij} + S_{ij}, where T_{ij} and
S_{ij} are the arrival and service times, respectively. The first departure over N runs will take place at the time given
by min_{i,j} {T_{ij} + S_{ij}}. By extreme value theory, min_{i,j} {T_{ij} + S_{ij}} will have a Weibull distribution asymptotically, Gumbel [3], no matter what the distribution function of the random variable (T_{ij} + S_{ij}) may be,
552 L. L. GEORGE AND A. C. AGRAWAL
provided (T_{ij} + S_{ij}) > 0. The expected value of a random variable x having a Weibull distribution is
given by

(3.4) E(x) = α Γ(1 + 1/β),

where α is the scale parameter and β is the shape parameter of the Weibull distribution. The scale
parameter α can be estimated as the mth order statistic (m counted from the bottom) for which

(3.5) 1 − m/(N + 1) = 1/e = 1/2.718… .
As N increases, m will increase, which means the value of α, the scale parameter, will decrease.
Thus the expected value given by Equation (3.4) will go to zero asymptotically for large values of N.
In the context of the service system, this means that the expected time of the first departure will asymptotically decrease to the lower support of the distribution as the number of runs increases. In other
words, the mean of the minimum order statistic of a random variable is of the order of the quantile for
which the probability value is (1 − 1/e), and thus will decrease to the smallest possible value of the
random variable asymptotically with N.
A simple illustration can be given by considering the service time to be a constant, t₀. The expected
time of the first departure in N runs, each run with n observations, is given by

(3.6) E min_{i,j} {T_{ij} + S_{ij}} = E{min_{i,j} (T_{ij}) + t₀}
= E{min_{i,j} (T_{ij})} + t₀
= 1/(λNn) + t₀,

where λ is the arrival rate of the Poisson arrival stream.

It can be seen from (3.6) that the expected time of the first departure in the case of constant service
time t₀ converges to the lower bound of the support of the service time distribution faster than 1/(N + 1)
as long as n > 1.
The proof by Marshall and Proschan [4] of strong consistency of the maximum likelihood estimate
of a distribution function under the assumption of increasing failure rate may be applied to show that
the maximum likelihood estimator of an increasing failure rate distribution function G(t) is strongly consistent at
the points of continuity; i.e.,

(3.1) Ĝ_N(t) = G(t),

with probability 1, for a sufficiently large number of repeated observations N on the service system output starting empty. This may be done because the failure rate function r(t), given as

r(t) = F′(t)/[1 − F(t)]

for an increasing failure rate distribution function F(t), corresponds to a nondecreasing intensity function λ(t) of a nonstationary Poisson process. In fact, λ(t) is the failure rate function of the distribution
function of the event times conditional on previous event times. The maximum likelihood estimate of
λ(t) based on event times of a nonstationary, nondecreasing-intensity Poisson process is the same as
the maximum likelihood estimator of the failure rate function r(t) of a nondecreasing failure rate
distribution.
4. NUMERICAL RESULTS AND COMPARISON WITH BROWN'S [2] ESTIMATOR
4.1 Brown's Estimator
In Figure 1, the number of customers in the system N(t) is plotted against time t. Let the origin
on the time axis be shifted to the right so that it coincides with the first output after the old origin 0.

FIGURE 1. Number of units in the system N(t) vs. time t.
Yᵢ, i = 1, 2, …, n, is the time between the new origin 0′ and the ith output point after the new origin.
Zᵢ is the time from Yᵢ back to the nearest input point prior to Yᵢ. For a stationary input process and independent, identically distributed service times in steady-state behavior, the Zᵢ are independent
and identically distributed. Let H(·) be the distribution function of Zᵢ, i = 1, 2, …, n, and Hₙ(·)
the empirical distribution function based on the observations Z₁, Z₂, …, Zₙ.
Then,

Pr[Zᵢ > x] = Pr[time back from ith output to last previous input > x]
= 1 − H(x)
= Pr[no input in an interval of length x and service takes longer than x]
= e^{−λx}(1 − G(x)),

or

1 − H(x) = e^{−λx}(1 − G(x)).
Thus, an estimate of H(·), given by Hₙ(x), may be used for estimating the service distribution
function G(x):

(4.1) Ĝ(x) = 1 − e^{λx}(1 − Hₙ(x)).

This estimate may not be nondecreasing. A nondecreasing estimate of G(x) is obtained by modifying
Equation (4.1):

(4.2) Ĝₙ(x) = max[0, max_{i: Zᵢ ≤ x} {1 − e^{λZᵢ}(1 − Hₙ(Zᵢ))}].
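The monotone estimator (4.2) can be sketched directly from the backward times Zᵢ and the known arrival rate λ. The function name and the list-based empirical distribution function below are illustrative assumptions of this sketch.

```python
import math

def brown_estimate(z, lam, x):
    """Monotone version (4.2) of Brown's estimator of G(x), built from the
    backward times Z_i and the arrival rate lambda (illustrative sketch)."""
    n = len(z)
    def H_n(t):                        # empirical distribution function of the Z_i
        return sum(zi <= t for zi in z) / n
    candidates = [1.0 - math.exp(lam * zi) * (1.0 - H_n(zi))
                  for zi in z if zi <= x]
    return max([0.0] + candidates)     # running maximum keeps the estimate nondecreasing
```

Because the inner maximum runs over the growing set {i : Zᵢ ≤ x}, the estimate is automatically nondecreasing in x, and the outer max with 0 keeps it a proper distribution-function value.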
4.2 Numerical Results
The maximum likelihood estimate of the hidden service distribution function G(x) was obtained
for simulated operation of an M/M/∞ system for various arrival rates λ and service rates μ. The results
shown in Figure 2 correspond to an arrival rate of λ = 1 customer/min and exponential service at rate
μ = 0.1 customer/min. Simulation was carried out for five runs, each consisting of 50 observations. The
estimated values Ĝ_N(x) are plotted against the output times Tᵢ for i = 1, 2, …, n. The empirical distribution function as well as an exponential distribution function for the service times are also plotted for
comparison with the simulated results. It can be seen that the estimated distribution function
is close to the empirical and the actual distribution functions. Brown's estimator was simulated for an
M/M/∞ system in steady state with λ = 1 customer/min and exponential service at rate μ = 0.5 customer/min. Results are shown in Figure 3, and it is found that Brown's estimated distribution function is close
to the empirical as well as the actual distribution function.

FIGURE 2. Maximum Likelihood Estimate (MLE); N = 5, n = 50; exponential service at rate μ = 0.1 customer/min, arrival rate λ = 1 customer/min. The empirical, maximum likelihood, and exponential distribution functions are shown.
Further simulations with different exponential service rates have shown that, while Brown's method
gives reasonable results for a system having service rate close to or larger than the arrival rate, the maximum
likelihood estimator is useful for slow-service-rate systems having large numbers of customers in the
system. This contrasting behaviour of the two estimators may be used to obtain better results
by applying Brown's estimator in the case of fast service and the maximum likelihood estimator in the case of slow
service.

FIGURE 3. Brown's estimator; arrival rate λ = 1 customer/min, service rate μ = 0.5 customer/min. The empirical distribution function, Brown's estimator, and the exponential distribution function are shown.
Simulation of the estimators was also performed for constant service times. The same remarks as
above apply to the usefulness of the two estimators relative to the service rate. It was also noted that the
maximum likelihood estimator converged to the true unknown service time from above, while Brown's
estimator was less biased.
REFERENCES
[1] Boswell, M. T., "Estimation and Testing Trend in a Stochastic Process of Poisson Type," The Annals of Mathematical Statistics, 37, 1564–1573 (1966).
[2] Brown, M., "An Estimation Problem in M/G/∞ Queues with Applications to Traffic," Technical Rept. No. 59, Department of Operations Research (Cornell University, Ithaca, New York, 1968).
[3] Gumbel, E. J., Statistics of Extremes (Columbia University Press, New York, 1958).
[4] Marshall, A. W. and F. Proschan, "Maximum Likelihood Estimation for Distributions with Monotone Failure Rate," The Annals of Mathematical Statistics, 36, 69–77 (1965).
[5] Mirasol, N. M., "The Output of an M/G/∞ Queuing System is Poisson," Operations Research, 11, 282–284 (1963).
THE SINGLE SERVER QUEUE IN DISCRETE TIME – NUMERICAL ANALYSIS III
Marcel F. Neuts * and Eugene Klimko
Purdue University
ABSTRACT
This paper deals with the stationary analysis of the finite, single server queue in discrete
time. The following stationary distributions and other quantities of practical interest
are investigated: (1) the joint density of the queue length and the residual service time,
(2) the queue length distribution and its mean, (3) the distribution of the residual service
time and its mean, (4) the distribution and the expected value of the number of customers
lost per unit of time due to saturation of the waiting capacity, (5) the distribution and the
mean of the waiting time, (6) the asymptotic distribution of the queue length following
departures.
The latter distribution is particularly noteworthy, in view of the substantial difference
which exists, in general, between the distributions of the queue lengths at arbitrary points
of time and those immediately following departures.
1. INTRODUCTION
This paper is a direct sequel to [2], to which we refer for a detailed definition and for the assumptions of the finite, discrete time queue. For easy reference, we only give a summary of the notation here.
NOTATION
L₁ Maximum number of customers allowed in the system at any time. All excess customers are
lost and do not return.
L₂ Maximum duration of the service time of a single customer.
r_j Probability that a service lasts for j units of time, j = 1, …, L₂. We assume without loss of
generality that r_{L₂} > 0. Also r₁ + … + r_{L₂} = 1.
K Maximum number of arrivals during a unit of time. It is assumed that K ≤ L₁.
p_j Probability that j customers arrive during a unit of time, j = 0, 1, …, K. We assume without
loss of generality that p₀ > 0 and p_K > 0. Also p₀ + … + p_K = 1.
X_n The number of customers in the system at time n+.
Y_n The number of time units until the customer in service at time n+ completes service. We note
that 0 ≤ Y_n ≤ L₂ and that Y_n = 0 if and only if X_n = 0.
In [2], it was shown that the bivariate sequence {(X_n, Y_n), n ≥ 0} is an irreducible, aperiodic Markov
chain with state space {(0, 0)} ∪ {(1, 2, …, L₁) × (1, …, L₂)}. Its transient behavior was discussed and investigated numerically in [2]. In this paper we first discuss the stationary joint distribution
of the queue length X_n and the residual service time Y_n.
*The research of this author was supported by the National Science Foundation, Contract No. GP 28650
558 M. F. NEUTS AND E. KLIMKO
2. THE EQUATIONS FOR THE STATIONARY JOINT PROBABILITIES OF X_n AND Y_n

We denote the stationary probabilities by P(i, j) for i = 1, …, L₁ and j = 1, …, L₂; P(0, 0)
is the stationary probability that the queue is empty. The stationary joint density of X_n and Y_n is the
unique solution to the following system of linear equations.

(1) a. P(0, 0) = p₀[P(1, 1) + P(0, 0)].

b. P(i, j) = Σ_{v=1}^{i} p_{i−v} P(v, j+1) + r_j [p_i(1 − p₀)^{−1} P(1, 1) + Σ_{v=2}^{i+1} p_{i−v+1} P(v, 1)],
for 1 ≤ i ≤ K, 1 ≤ j ≤ L₂ − 1.

c. P(i, j) = Σ_{v=i−K}^{i} p_{i−v} P(v, j+1) + r_j Σ_{v=i−K+1}^{i+1} p_{i−v+1} P(v, 1),
for K + 1 ≤ i ≤ L₁ − 1, 1 ≤ j ≤ L₂ − 1.

d. P(L₁, j) = P(L₁, j+1) + Σ_{v=1}^{K} (1 − Σ_{k=0}^{v−1} p_k) P(L₁ − v, j+1) + r_j Σ_{v=L₁−K+1}^{L₁} (Σ_{k=L₁−v+1}^{K} p_k) P(v, 1),
for 1 ≤ j ≤ L₂ − 1.

e. P(i, L₂) = r_{L₂} [p_i(1 − p₀)^{−1} P(1, 1) + Σ_{v=2}^{i+1} p_{i−v+1} P(v, 1)],
for 1 ≤ i ≤ K.

f. P(i, L₂) = r_{L₂} Σ_{v=i−K+1}^{i+1} p_{i−v+1} P(v, 1),
for K + 1 ≤ i ≤ L₁ − 1.

g. P(L₁, L₂) = r_{L₂} Σ_{v=L₁−K+1}^{L₁} (Σ_{k=L₁−v+1}^{K} p_k) P(v, 1).

h. P(0, 0) + Σ_{i=1}^{L₁} Σ_{j=1}^{L₂} P(i, j) = 1.
The system (1) contains L₁L₂ + 1 independent linear equations in L₁L₂ + 1 unknowns. We shall show
that its solution may be conveniently expressed in terms of the solution of a homogeneous system of
SINGLE SERVER QUEUE III 559
L₁ equations in L₁ unknowns. Moreover, the latter system has a particular structure which greatly
simplifies its numerical solution.
We denote by P_j the L₁-tuple [P(1, j), …, P(L₁, j)] for j = 1, …, L₂. We also introduce the
L₁ × L₁ matrices A and B, defined entrywise as follows (all unspecified entries are zero):

A_{v,i} = p_{i−v}, for v ≤ i ≤ L₁ − 1 and 0 ≤ i − v ≤ K,
A_{v,L₁} = 1 − Σ_{k=0}^{L₁−v−1} p_k, for L₁ − K ≤ v ≤ L₁;

B_{1,i} = p_i(1 − p₀)^{−1}, for 1 ≤ i ≤ min(K, L₁ − 1),
B_{v,i} = p_{i−v+1}, for 2 ≤ v ≤ L₁, i ≤ L₁ − 1 and 0 ≤ i − v + 1 ≤ K,
B_{v,L₁} = Σ_{k=L₁−v+1}^{K} p_k, for L₁ − K + 1 ≤ v ≤ L₁.

Thus A describes a unit of time during which no service is completed, with the excess over L₁ absorbed in the last column, and B describes a unit of time at the end of which a service is completed and a new service begins; the first row of B carries the factor (1 − p₀)^{−1} because, by Equation (1a), P(0, 0) = p₀(1 − p₀)^{−1} P(1, 1).
In terms of A and B, the equations (1b–g) may be written as
(2) P_j = P_{j+1} A + r_j r_{L₂}^{−1} P_{L₂}, 1 ≤ j ≤ L₂ − 1,
P_{L₂} = r_{L₂} P₁ B.

The latter system is equivalent to the equations

(3) P_j = r_{L₂}^{−1} P_{L₂} Σ_{v=j}^{L₂} r_v A^{v−j}, 1 ≤ j ≤ L₂ − 1,
P_{L₂} = P_{L₂} Σ_{v=1}^{L₂} r_v A^{v−1} B.
We now observe that both A and B are stochastic matrices, that A is upper triangular, and that the
matrix B has only one nonzero subdiagonal. We shall say, for brevity, that B is nearly upper triangular. Since
r₁ + … + r_{L₂} = 1 and A is an upper triangular stochastic matrix, the matrix Σ_{v=1}^{L₂} r_v A^{v−1} is stochastic
and upper triangular. The stochastic matrix B is irreducible, so that the matrix

(4) Q = Σ_{v=1}^{L₂} r_v A^{v−1} B

is irreducible and stochastic. Finally, it is easy to verify that Q is nearly upper triangular.
The vector P_{L₂} is therefore proportional to the vector of the stationary probabilities of the matrix Q.
The nearly upper triangular form of the matrix Q makes the numerical computation of the vector
P_{L₂}, up to a positive multiplicative constant, particularly simple. The vector P_{L₂} is proportional to the
vector (t₁, t₂, …, t_{L₁}), whose components may be computed recursively as follows:

(5) t₁ = 1,
t₂ = (1 − q₁₁) q₂₁^{−1},
t_k = [t_{k−1}(1 − q_{k−1,k−1}) − Σ_{v=1}^{k−2} t_v q_{v,k−1}] q_{k,k−1}^{−1}, 3 ≤ k ≤ L₁.

It is easy to verify that none of the entries q_{k,k−1}, 2 ≤ k ≤ L₁, vanishes, so that by using the first equation
in (2), the vectors P_j, j = 1, …, L₂ − 1, may be computed up to a common, positive multiplicative
constant. Equation (1a) is then used to determine P(0, 0) up to the same multiplicative constant. This
constant may finally be computed using Equation (1h). The stationary joint density of the queue length
and the residual service time is therefore determined.
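The recursion (5) amounts to solving t = tQ column by column, using the single nonzero subdiagonal entry as the pivot. The following is a sketch with plain list-of-lists matrices; the function name and the final normalization (the paper carries the multiplicative constant separately) are our additions.

```python
def staprob(Q):
    """Stationary vector of an irreducible stochastic matrix with a single
    nonzero subdiagonal, via recursion (5); normalized before returning."""
    L = len(Q)
    t = [1.0]                                    # t_1 = 1
    if L > 1:
        t.append((1.0 - Q[0][0]) / Q[1][0])      # t_2 = (1 - q_11)/q_21
    for k in range(2, L):                        # paper's k = 3, ..., L_1
        s = t[k - 1] * (1.0 - Q[k - 1][k - 1]) \
            - sum(t[v] * Q[v][k - 1] for v in range(k - 1))
        t.append(s / Q[k][k - 1])                # pivot on the subdiagonal entry
    total = sum(t)
    return [x / total for x in t]
```

Each column equation of t = tQ involves at most one "new" unknown, which is why the whole stationary vector comes out in a single O(L₁²) sweep instead of a general linear solve.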
3. THE STATIONARY DENSITY OF THE WAITING TIME
The support of the stationary density {w_j} of the waiting time consists of the integers 0, 1, …,
L₁L₂. Clearly w₀ = P(0, 0), and for 1 ≤ j ≤ L₁L₂ the density may be written symbolically as the convolution polynomial

(6) {w_j} = P(1, ·) + P(2, ·) * {r_v} + P(3, ·) * {r_v}^{(2)} + … + P(L₁, ·) * {r_v}^{(L₁−1)},

where {r_v} is the density of the service time and {r_v}^{(k)} denotes its k-fold convolution.

The numerical computation of the w_j, 1 ≤ j ≤ L₁L₂, by using a convolution analogue of Horner's
algorithm for polynomials was discussed in [2].
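A convolution analogue of Horner's rule evaluates (6) innermost-first, with one convolution by {r_v} per step. The representation of each density as a list indexed by the time value 0, 1, 2, … and all function names are assumptions of this sketch, not the routine of [2].

```python
def waiting_time_density(P00, P, r):
    """Density {w_j} of (6) by a Horner-style convolution scheme (sketch).
    P[i] is the list P(i+1, .) and r the service-time density; in every list,
    position j holds the probability of the value j."""
    def conv(a, b):                    # ordinary discrete convolution
        out = [0.0] * (len(a) + len(b) - 1)
        for i, ai in enumerate(a):
            for j, bj in enumerate(b):
                out[i + j] += ai * bj
        return out
    def add(a, b):                     # elementwise sum with zero padding
        return [(a[i] if i < len(a) else 0.0) + (b[i] if i < len(b) else 0.0)
                for i in range(max(len(a), len(b)))]
    acc = P[-1]                        # start from P(L1, .) ...
    for Pi in reversed(P[:-1]):        # ... then acc <- P(i, .) + acc * {r_v}
        acc = add(Pi, conv(acc, r))
    return add([P00], acc)             # w_0 = P(0, 0)
```

As with Horner's rule for polynomials, only L₁ − 1 convolutions are needed instead of forming each convolution power {r_v}^{(k)} separately.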
4. THE STATIONARY DENSITY OF THE NUMBER OF LOST CUSTOMERS PER UNIT
OF TIME
Since the waiting room is finite, it is possible that customers will be lost due to the waiting room
being full at their arrival time. It is therefore of interest to know the stationary density {φ_j} of the
number of lost customers per unit of time. It has its support on the integers 0, 1, …, K and may be
determined by the explicit expressions

(7) φ_j = Σ_{k=j}^{K} p_k Σ_{ρ} P(L₁ − k + j, ρ), 1 ≤ j ≤ K,

φ₀ = 1 − Σ_{j=1}^{K} φ_j,

where the inner sum extends over the possible values ρ of the residual service time. Knowing the joint
density discussed in section 2, the probabilities {φ_j} are readily computed.
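Formula (7) needs only the marginal queue length probabilities, so a sketch can take those as input directly; the argument names are ours.

```python
def lost_customer_density(p, queue_marginal, L1, K):
    """Stationary density {phi_j} of the number of customers lost per unit
    of time, formula (7); queue_marginal[i] = P(queue length = i), 0 <= i <= L1."""
    phi = [0.0] * (K + 1)
    for j in range(1, K + 1):
        # j customers are lost exactly when k arrivals meet a queue of length L1 - k + j
        phi[j] = sum(p[k] * queue_marginal[L1 - k + j] for k in range(j, K + 1))
    phi[0] = 1.0 - sum(phi[1:])
    return phi
```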
5. THE STATIONARY DENSITY OF THE QUEUE LENGTH AT DEPARTURES
The probabilities associated with the queue length at departure times are primarily of interest
in the analytic treatment of queues of M|G|1 type. Although they are frequently examined, their inherent applied interest is limited.
As we shall indicate below, the density of the queue length following departures may easily be
obtained from auxiliary quantities which are computed in the process of evaluating the joint stationary
density, discussed in section 2. In view of the importance ascribed to this density in the applied queueing
literature, we decided to investigate its computational aspects. Note the very substantial difference
which may exist between it and the stationary density of X_n.
The queue lengths following departures form an irreducible, aperiodic Markov chain with state
space {0, 1, …, L₁ − 1}. Let us denote its transition probability matrix by T. Furthermore, let θ_k(i, v)
be the probability that in k consecutive units of time during which no departures occur, v customers
join the queue, given that the queue length at the beginning of the first unit of time was i.
The entries of T are then given by

(8) T_{0j} = Σ_{k=1}^{L₂} r_k Σ_{h=1}^{K} p_h(1 − p₀)^{−1} θ_k(h, j − h + 1), for 0 ≤ j ≤ L₁ − 1,

T_{ij} = Σ_{k=1}^{L₂} r_k θ_k(i, j − i + 1), for 1 ≤ i ≤ j + 1,

T_{ij} = 0, for i > j + 1.
We note that the transition probability matrix T is nearly upper triangular. The stationary probabilities corresponding to T may be calculated by a simple recursion such as in Formula (5). In order
to evaluate the entries of the matrix T, we first show that

(9) θ_k(i, j − i + 1) = (A^k)_{i,j+1}, for 1 ≤ i ≤ L₁, 0 ≤ j ≤ L₁ − 1,

where A is the upper triangular matrix defined in section 2.

For k = 1, we find that

(10) θ₁(i, j − i + 1) = p_{j−i+1}, for 0 ≤ j − i + 1 ≤ K, j ≤ L₁ − 2,

θ₁(i, j − i + 1) = Σ_{v=L₁−i}^{K} p_v, for L₁ − K ≤ i ≤ L₁, j = L₁ − 1,

θ₁(i, j − i + 1) = 0, for all other pairs (i, j),
so that Equation (9) holds for k = 1. Furthermore,

(11) θ_{k+1}(i, j − i + 1) = Σ_{v=max(i−1, j−K)}^{j} θ_k(i, v − i + 1) p_{j−v},

for 0 ≤ j ≤ L₁ − 2, and

θ_{k+1}(i, L₁ − i) = Σ_{v=L₁−K}^{L₁} θ_k(i, v − i) Σ_{h=L₁−v}^{K} p_h,

for 1 ≤ i ≤ L₁. When expressed in terms of the matrix A, Formula (11) proves (9) inductively.

The matrix T can be compactly written as

(12) T = C Σ_{k=1}^{L₂} r_k A^k,

where C is the L₁ × L₁ matrix with C_{1,j} = p_j(1 − p₀)^{−1} for 1 ≤ j ≤ K, C_{i,i−1} = 1 for 2 ≤ i ≤ L₁, and C_{i,j} = 0 for all other pairs (i, j).
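Equation (12) can be evaluated directly with dense matrix arithmetic. For large L₁ one would exploit the structure as in section 6, but a naive sketch (all names are ours) shows the shape of the computation:

```python
def departure_matrix(A, C, r):
    """T = C * sum_k r_k A^k, formula (12), with plain list-of-lists matrices."""
    L = len(A)
    def matmul(X, Y):
        return [[sum(X[i][m] * Y[m][j] for m in range(L)) for j in range(L)]
                for i in range(L)]
    S = [[0.0] * L for _ in range(L)]
    Ak = [row[:] for row in A]                 # current power A^k, starting at A^1
    for rk in r:                               # accumulate r_k A^k
        for i in range(L):
            for j in range(L):
                S[i][j] += rk * Ak[i][j]
        Ak = matmul(Ak, A)
    return matmul(C, S)
```

Since A, and C as defined above, are stochastic, the result T is again a stochastic matrix, as a transition matrix must be.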
The relation between the limiting distribution of the queue length following the nth departure
and the stationary queue length distribution is noteworthy. A well-known theorem, from Reference
[3], states that in a stable M|G|1 queue with single arrivals, the queue length at time t and the queue
length following the nth departure have the same limiting distribution as t and n, respectively, tend to
infinity.
An analogous result holds for the discrete time queue discussed by Dafermos and Neuts [1],
provided that the probability of … time is zero. This result is proved
by an exact analogue of the argument …; we shall omit the proof.

In the case of group arrivals (K > 1), … those two limiting distributions. Theorems which relate those … the theory of Markov renewal
processes, but the resulting form… We do not pursue this topic here, but we
offer as an illustration some numerical … rare arrivals of large groups of
customers.

We considered a queue with L₁ = …, p₀ = 0.975, p₂₀ = 0.025, r₁ = r₂ = 0.5.
Although the traffic intensity ρ for this queue …, examination of the transient behavior
shows that this queue converges very …

The limiting distribution of the queue length following a departure has a mean equal to 32.2864.
In contrast, the limiting distribution of the queue length at time n has a mean equal to 24.1752. In
addition, we list a summary of the … stationary distributions (π_k is the stationary probability of at most k customers … following a departure from the system …).

The greater limiting probability of … following departures may appear to be
paradoxical at a casual reading. A … shows, however, that on the contrary, this is to
be anticipated in stable queues with … For example, the queue length will typically
be zero for long intervals of time because … The averaging procedure involved in
the stationary distribution of the queue length … heavily favors the lower values of k. The
limiting distribution of the queue length following the nth departure effectively ignores the long idle
periods and results primarily from the … during the service of the large groups of
customers. The high probabilities of … k in this distribution are therefore not
surprising.

This example strikingly shows that … of the queue features may be of
limited practical value, even in very stable … Most realizations of the queue length process in
our example will exhibit very substantial … reflected in the asymptotic distributions. The practical questions related to queues of this type can only be answered after analyzing
their transient behavior. The exclusive concern with asymptotic results in "practical" discussions of
queueing theory is therefore regrettable.
6. COMPUTATIONAL ORGANIZATION
In order to minimize both the computation time and the required memory storage, we took advantage of the highly structured form of the matrices Q and T in Equations (4) and (12), respectively.
The basic matrix is the upper triangular polynomial matrix Q* = {q*_{ij}},
(13) Q* = Σ_{v=1}^{L₂} r_v A^{v−1}.

The rows of this matrix are similar in the sense that

(14) q*_{i,i+v} = q*_{1,v+1},

for v = 0, 1, 2, …, L₁ − i − 1; i = 2, 3, …, L₁ − 1. Furthermore, the matrix Q* is stochastic, so that

(15) q*_{i,L₁} = 1 − Σ_{v=1}^{L₁−i} q*_{1,v}.

Therefore, the first row determines the entire matrix. This permits the storage of Q* using only L₁
memory spaces, rather than the (L₁² + L₁)/2 spaces required for an arbitrary upper triangular matrix.
The resulting saving in memory space is substantial for large queues and in fact makes the analysis
of queue lengths up to 800 feasible. Computation of the matrix Q* is performed by using Horner's
method for the evaluation of polynomials, i.e., by the recursive computation

(16) Q*₁ = r_{L₂} A + r_{L₂−1} I,
Q*_n = Q*_{n−1} A + r_{L₂−n} I, n = 2, …, L₂ − 1.

Each of the successive matrices Q*_n is completely determined by its top row. The rightmost elements
are not needed and therefore are not computed. The top row entries of Q*_{n+1} are rapidly calculated by
means of the formulas

(17) q*_{1,j}^{(n+1)} = Σ_{i=0}^{min(K,j−1)} p_i q*_{1,j−i}^{(n)} + r_{L₂−n−1} δ_{j1}, for 1 ≤ j ≤ K,

q*_{1,j}^{(n+1)} = Σ_{i=0}^{K} p_i q*_{1,j−i}^{(n)}, for K < j ≤ L₁ − 1,

where δ_{j1} = 1 if j = 1 and 0 otherwise.
The matrix Q has a nearly upper triangular form in which the third through the last rows, except
for the last two columns, are essentially repetitions of the second row shifted successively one position
to the right; that is, q_{i,j} = q_{2,j−i+2} for 3 ≤ i ≤ L₁ and i − 1 ≤ j ≤ L₁ − 2. The last column is determined
by the condition that the rows sum to one. We therefore need to compute and store only the first and
second rows and the (L₁ − 1)st column. This requires 3L₁ − 4 memory cells for the storage of the Q
matrix rather than the L₁ + (L₁ + 2)(L₁ − 1)/2 required for an arbitrary nearly upper triangular matrix.
The top row elements of Q are given by
(18) q_{1,j} = p_j(1 − p₀)^{−1} q*_{1,1} + Σ_{i=0}^{min(K,j−1)} p_i q*_{1,j−i+1}, for j ≤ K,

q_{1,j} = Σ_{i=0}^{min(K,j−1)} p_i q*_{1,j−i+1}, for K < j < L₁ − 1,

q_{1,L₁−1} = p₀(1 − Σ_{v=1}^{L₁−1} q*_{1,v}) + Σ_{i=1}^{K} p_i q*_{1,L₁−i};

the second row elements of the Q matrix are

(19) q_{2,j} = Σ_{i=0}^{min(K,j−1)} p_i q*_{1,j−i}, for 1 ≤ j ≤ L₁ − 2;

and the (L₁ − 1)st column elements are calculated by using

(20) q_{i,L₁−1} = p₀(1 − Σ_{v=1}^{L₁−i} q*_{1,v}) + Σ_{k=1}^{min(K,L₁−i)} p_k q*_{i,L₁−k},

for 2 ≤ i ≤ L₁.
The stationary probabilities of the Q matrix were determined using Formula (5) and its compact
representation by Formulas (18)–(20). For this purpose, a subroutine called STAPROB was written. The
resulting stationary probability vector was identified temporarily with the vector P_{L₂}. The vectors
P_{L₂−1}, …, P₁ were successively obtained from

(21) P_j = P_{j+1} A + r_j r_{L₂}^{−1} P_{L₂}, for j = L₂ − 1, …, 1;

for this computation essentially only the top row of the matrix A is needed. Finally, P(0, 0) is computed
from Equation (1a) and the normalization condition (1h).

The waiting-time distribution was computed by a subroutine called WAIT. This
subroutine was adapted from the … In cases where L₁L₂ is large, one may wish
to print only the percentage points of the distribution. A routine to do this was also written.

The computational procedure for the queue length … departure is similar to that for the
stationary queue length distribution. The matrix

(22) Σ_{k=1}^{L₂} r_k A^k

is first computed, and then the matrix T is assembled in a manner similar to that of the
matrix Q. Only a modicum of additional … The stationary distribution is then
calculated by the subroutine STAPROB.
Testing

In addition to testing the program …, we compared the stationary probabilities
with the transient probabilities after 60 …, which were obtained by the methods developed
in [2].
Computational Experience

Practical limits on the problem size are set by … memory requirements. The available
memory space of 150K octal … approximately. This permits, for instance,
queue lengths of size 800 with service time distributions on … 5 points. For problems of this magnitude
the computation time was a limiting factor in the … of the waiting-time distribution. We
ran examples, both with and without … the waiting time. The central processing times
on the CDC 6500 at Purdue University … are shown in Table 2. T₁ and T₂ are the
actual program running times in seconds (without … and loading times), respectively, with and
without the computation of the waiting-time distribution. For the example with L₁ = 800, L₂ = 25,
K = 4, the time T₁ was in excess of 3,000 sec, and the computations were not completed even then.
In all the examples, we used the same arrival distribution p₀ = 0.8, p₁ = p₂ = p₃ = p₄ = 0.05. The service
time distribution for the first three examples was … = 0.05, and r₅ = 0.175. In the
last example, the service time distribution was … geometric with p = 0.5, and the residual
probability was added to r₂₅.
Table 2

 L₁    L₂    K         T₁        T₂
 100    5    4        5.751     0.945
 200    5    4       22.539     2.221
 400    5    4       69.774     6.612
 800   25    4   > 3,071.032   26.290
7. CONCLUSIONS
Large discrete, single server queues in the stationary phase may be analyzed numerically. As we
have shown, most queue features of interest, with the possible exception of the stationary waiting-time
distribution, can be computed without the use of excessive processing times. This should be contrasted
with simulation methods, which are inherently ill-suited for the study of the stationary phase.
The prohibitive processing times required for the waiting-time distribution in large queues raise
the interesting question of how to evolve efficient numerical procedures for the evaluation of expressions
of the general type

…
which appear frequently in stochastic models of varied applied interest.
Finally, the example discussed in section 5 shows that in queues exhibiting large fluctuations, it
may be hazardous to base conclusions on a single stationary distribution. In such cases one should study
the transient behavior, whenever possible.
For further information on the algorithms discussed in this paper, one may contact either of the
authors at the Department of Statistics, Purdue University, West Lafayette, Ind. 47907.
REFERENCES
[1] Dafermos, S. and M. F. Neuts, "A Single Server Queue in Discrete Time," Cahiers du Centre de Recherche Opérationnelle, 13, 23–40 (1971).
[2] Neuts, M. F., "The Single Server Queue in Discrete Time – Numerical Analysis I," Nav. Res. Log. Quart., 20, 297–304 (1973).
[3] Takacs, L., Introduction to the Theory of Queues (Oxford University Press, New York, 1962).
SOME EXPERIMENTS IN GLOBAL OPTIMIZATION
James K. Hartman
Naval Postgraduate School
Monterey, California
ABSTRACT
When applied to a problem which has more than one local optimal solution, most nonlinear programming algorithms will terminate with the first local solution found. Several
methods have been suggested for extending the search to find the global optimum of such a
nonlinear program. In this report we present the results of some numerical experiments
designed to compare the performance of various strategies for finding the global solution.
I. INTRODUCTION
It is frequently the case in applied optimization studies that an algorithm which is known to converge to a global optimal solution under certain conditions (such as convexity) will be applied to a problem which does not satisfy these conditions. In particular, optimization problems which are suspected
of having several local optima in addition to the global optimum are often solved using algorithms
which will stop and indicate a solution whenever any local optimum is reached. In such cases a useful
strategy is to repeat the solution process several times starting from different initial points to avoid
accepting a solution which is only a local optimum. This is probably the most frequently suggested
strategy for avoiding local solutions.
There are also other strategies for avoiding the local solutions in favor of the global optimum. This
paper describes some numerical experiments which were done to compare the performance of several
strategies for organizing such a global optimization.
II. THE PROBLEM
In order to develop and test strategies for avoiding local solutions it is necessary to specify a class
of optimization problems to be considered. This paper will concentrate on the "essentially unconstrained" nonlinear programming problem

(1) minimize f(x)
subject to x ∈ S ⊂ Eⁿ,
where the local and global optimal solutions to (1) are known to occur in the interior of the set S. In
such a problem the feasible region S determines a domain to be searched for solutions, but the boundaries of S do not determine the solutions. In this sense problem (1) can be considered "essentially
unconstrained." The simplest way to specify the set S is to place upper and lower bounds on each
variable. Since each of the strategies to be considered will involve random selections of x, it is necessary
to confine the search to a bounded region S. In addition, search strategies S5 and S6 will partition S
into smaller regions; these two strategies can only be conveniently described for S determined by
upper and lower bounds on the variables.
570 J. K. HARTMAN
"Essentially unconstrained" problems arise frequently as the "unconstrained" subproblems in
interior penalty function algorithms such as the Sequential Unconstrained Minimization Technique of
Fiacco and McCormick [3]. In the SUMT method, if the original nonlinear program is not a convex
program, then the subproblem (1) may have local solutions which are distinct from the global solution.
For problems like (1) a local optimal solution can be obtained by applying any of the efficient
unconstrained descent algorithms (such as the Davidon-Fletcher-Powell method) to minimize the function f(x) while being careful not to penetrate the boundary of S. We shall now consider several strategies which try to ensure that the local solution we finally accept is, in fact, a global minimum.
III. STRATEGIES FOR AVOIDING LOCAL SOLUTIONS
Six different strategies for organizing a global optimization are compared in this paper. These are
briefly described below with references to more complete descriptions when they exist.
STRATEGY SI (From the folklore):
a. Set k= 1.
b. Let xᵏ be a vector chosen at random in the search region S. Starting at xᵏ, perform an unconstrained minimization search on the function f(x), terminating at the local minimum xᵏ*.
c. Replace k with k + 1 and go to step b. At each stage retain the best local solution obtained to
date.
SI is the strategy suggested in section I. Intuitively the problem with this strategy is that it may re
peatedly search to the same local minimum if the starting points x k happen to be chosen within the
"range of attraction" of that local minimum. The next three strategies attempt to solve this problem.
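As an illustrative sketch (not the paper's code), S1 can be written in a few lines; the two-well test function, the crude derivative-free descent routine, and all parameter values below are our own stand-ins for the paper's Powell minimizer and test problems:

```python
import random

def f(x):
    # Illustrative two-well function: local minima near x = +1 and x = -1,
    # with the global minimum on the negative side (f approx. -0.305).
    return (x * x - 1.0) ** 2 + 0.3 * x

def local_min(x, step=0.05, tol=1e-6):
    # Crude derivative-free descent standing in for the Powell minimizer:
    # move while the function decreases, halving the step on failure.
    while step > tol:
        if f(x + step) < f(x):
            x += step
        elif f(x - step) < f(x):
            x -= step
        else:
            step *= 0.5
    return x

def strategy_s1(lo=-2.0, hi=2.0, restarts=20, seed=0):
    rng = random.Random(seed)
    best_x, best_f = None, float("inf")
    for _ in range(restarts):
        x_k = rng.uniform(lo, hi)      # step b: random start in S
        x_star = local_min(x_k)        # unconstrained descent from x_k
        if f(x_star) < best_f:         # step c: retain best local solution
            best_x, best_f = x_star, f(x_star)
    return best_x, best_f
```

With enough restarts, at least one start lands in the global basin and the best retained solution is the global minimum.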
STRATEGY S2:
a. Set k = 1. Let f* be the objective function value for the best local solution so far obtained. Initially f* = +∞.
b. Randomly select points x ∈ S until one is found with f(x) < f*. Call this point x^k.
c. Starting at x^k, perform an unconstrained minimization search terminating at a new local minimum x^k*.
d. Set f* = f(x^k*), replace k with k + 1, and go to step b.
In S2 a minimization (step c) is initiated at x^k only if f(x^k) is smaller than the best solution f* found to date. Hence, each successive minimization gives a new local minimum which is better than any found so far. The same local minimum cannot be located twice. It is, however, much more difficult to determine the starting points x^k for strategy S2 than for S1.
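A sketch of S2 in the same illustrative setting (our own two-well function and crude descent routine, not the paper's code); note that every call to the local minimizer is gated on f(x) < f*:

```python
import random

def f(x):
    # Two-well function: local minima near +1 (f approx. 0.29) and
    # near -1 (f approx. -0.31, the global minimum).
    return (x * x - 1.0) ** 2 + 0.3 * x

def local_min(x, step=0.05, tol=1e-6):
    # Crude derivative-free descent (a stand-in for the Powell minimizer).
    while step > tol:
        if f(x + step) < f(x):
            x += step
        elif f(x - step) < f(x):
            x -= step
        else:
            step *= 0.5
    return x

def strategy_s2(lo=-2.0, hi=2.0, samples=500, seed=1):
    f_star = float("inf")              # step a: f* = +infinity
    x_best = None
    rng = random.Random(seed)
    for _ in range(samples):
        x = rng.uniform(lo, hi)        # step b: sample until f(x) < f*
        if f(x) < f_star:
            x_min = local_min(x)       # step c: minimize only then
            x_best, f_star = x_min, f(x_min)   # step d: update f*
    return x_best, f_star
```

Each successive minimization is strictly better than the last, so the same local minimum is never located twice; most of the budget, however, is spent on random evaluations that fail the f(x) < f* test.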
STRATEGY S3 (Bocharov [1]):
a. Choose x^1 randomly in S. Set k = 1.
b. Starting from x^k, perform an unconstrained minimization terminating at the local minimum x^k*.
c. Choose a direction vector d^k ∈ E^n at random and consider f(x^k* + αd^k) as the positive scalar α increases. Moving away from x^k* in direction d^k, the function f must initially increase (since x^k* is a local minimum). Continue to increase α until f begins to decrease, at α = α_k.
d. Let x^{k+1} = x^k* + α_k d^k, replace k with k + 1, and go to step b.
STRATEGY S4 (Bocharov [1]):
S4 is the same as S3 except that in step c, instead of choosing the direction at random, d^k is chosen to be the direction of overall progress from the most recent minimization:
(2) d^k = x^k* − x^k.
Both S3 and S4 attempt to prevent repeated minimization to the same local optimum by moving out of the region of attraction of the most recent local solution before starting the next minimization. By continuing in the direction (2), strategy S4 hopes to also avoid local minima detected before the most recent minimum.
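The ridge-crossing step c can be sketched as follows, again with our own one-dimensional two-well function; in one dimension the random direction of S3 and the progress direction (2) of S4 both reduce to a sign:

```python
def f(x):
    # Two wells: local minimum near x = +1, global minimum near x = -1.04.
    return (x * x - 1.0) ** 2 + 0.3 * x

def local_min(x, step=0.05, tol=1e-6):
    # Crude derivative-free descent used for the repeated minimizations.
    while step > tol:
        if f(x + step) < f(x):
            x += step
        elif f(x - step) < f(x):
            x -= step
        else:
            step *= 0.5
    return x

def escape(x_star, d, alpha_step=0.05, alpha_max=10.0):
    # Step c: move away from the local minimum x* along d; f must first
    # increase, and we stop as soon as it begins to decrease (alpha = a_k).
    alpha = alpha_step
    prev = f(x_star + alpha * d)
    while alpha < alpha_max:
        alpha += alpha_step
        cur = f(x_star + alpha * d)
        if cur < prev:                 # crossed the ridge
            return x_star + alpha * d
        prev = cur
    return None                        # no decrease found along d

x1 = local_min(1.5)                    # first descent: well near +1
x2 = local_min(escape(x1, -1.0))       # step out over the ridge, descend again
best = min((x1, x2), key=f)
```

Starting in the shallower well, the escape step carries the search past the ridge near x = 0.08, and the second minimization lands in the global well.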
Strategies S5 and S6 are considerably different from the first four methods. While S1-S4 attempt to choose good starting points for repeated local minimizations, S5 and S6 attempt to gain information about the entire search region S, gradually concentrating their attention on portions of S which are, in some sense, "likely" to contain the global minimum. S5 and S6 are most easily described for problems where S is determined by lower and upper bounds on each variable:
S = {x ∈ E^n | l_i ≤ x_i ≤ L_i, i = 1, …, n}.
For ease of presentation we will restrict our attention to such problems.
STRATEGY S5 (Piecewise Coordinate Projection — Zakharov [5]):
a. Set up an initially empty list of points, and let S̄ = {x ∈ E^n | l_i ≤ x_i ≤ L_i, i = 1, …, n} be the "remaining feasible region." Let S̄ = S initially.
b. Randomly choose N points x^k ∈ S̄, compute f(x^k) for each, and adjoin them to the list.
c. For each coordinate x_i of x (i = 1, …, n), separate the remaining feasible interval [l_i, L_i] into m equal subintervals. Let X_ij = {x^k in the list whose ith component is in the jth subinterval of [l_i, L_i]} = {x^k | l_i + (j − 1)(L_i − l_i)/m ≤ x_i^k < l_i + j(L_i − l_i)/m} for i = 1, …, n and j = 1, …, m. Then X_i1, X_i2, …, X_im describe the projection of the list of points x^k onto the m subintervals of the ith coordinate axis.
d. By considering {f(x^k) | x^k ∈ X_ij} (i = 1, …, n; j = 1, …, m), select the subinterval set X_st which is considered most likely to contain the global minimum. Briefly, this is done by selecting the subinterval set for which the average function value is smallest, being careful to avoid choices based on insufficient information (for more details see Zakharov [5]).
e. By redefining l_s and L_s, delete the subinterval sets X_sj (j = 1, …, m; j ≠ t) from the remaining feasible region. Delete each point x^k in the list whose sth coordinate is in a deleted subinterval X_sj. Go to step b.
As the remaining feasible region S̄ gradually shrinks, the global minimum will be more and more closely bracketed. The problem with this method is that the most promising subinterval must be determined on the basis of the sample of points x^k chosen so far. There is always a chance that a subinterval chosen for deletion will, in fact, contain the global minimum solution, and once it is deleted it can never be recovered.
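A one-dimensional sketch of the projection-and-deletion cycle (our own test function and parameter choices, with the subinterval scoring reduced to the smallest sample average). With these settings the method brackets x ≈ −1 but can permanently delete the subinterval holding the true minimizer near x ≈ −1.04, exactly the deletion risk described above:

```python
import random

def f(x):
    # Two-well function; global minimizer near x = -1.04, f approx. -0.305.
    return (x * x - 1.0) ** 2 + 0.3 * x

def strategy_s5_1d(lo=-2.0, hi=2.0, n_points=120, m=4, rounds=6, seed=2):
    rng = random.Random(seed)
    pts = []                                     # step a: empty list
    for _ in range(rounds):
        pts += [rng.uniform(lo, hi) for _ in range(n_points)]   # step b
        width = (hi - lo) / m                    # step c: m subintervals
        groups = [[] for _ in range(m)]
        for x in pts:
            groups[min(int((x - lo) / width), m - 1)].append(f(x))

        def avg(j):                              # step d: smallest average f
            return sum(groups[j]) / len(groups[j]) if groups[j] else float("inf")

        t = min(range(m), key=avg)
        # step e: redefine the bounds, drop points in deleted subintervals
        lo, hi = lo + t * width, lo + (t + 1) * width
        pts = [x for x in pts if lo <= x <= hi]
    return (lo + hi) / 2.0
```

After six rounds the remaining interval has width (hi − lo)/4^6 and is centered very near x = −1; the bracket is good but, because the first deletion kept [−1, 0], the exact minimizer just outside it is lost.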
Strategy S6 attempts to solve this problem by retaining the entire region S throughout and using
a probabilistic allocation device to concentrate attention on areas in S which are most promising. This
algorithm is new and is still under development. Initial results show some promise, but considerable
improvement is still necessary.
STRATEGY S6 (Coordinatewise Allocation):
a. Define a marginal probability distribution function Φ_i on the feasible interval [l_i, L_i] of each coordinate axis i = 1, …, n. In the absence of other information, a uniform distribution seems reasonable for the initial distribution.
b. Randomly choose N points x^k ∈ S and compute f(x^k) for each. The probability distribution functions Φ_i govern these choices in that the ith component x_i^k of x^k is chosen as a random sample point from the distribution Φ_i. Thus, the Φ_i determine the allocation of trial points to various regions in S.
c. Based on the results of the trials to date, modify the Φ_i to increase the allocation of future points to regions considered likely to contain the global minimum. Go to step b.
Strategy S6 can have many realizations depending on the method of handling step c. In the version of S6 reported in this paper, step c is performed as follows for each coordinate i = 1, …, n:
1. The feasible interval [l_i, L_i] is split into m subintervals.
2. A "success" is defined as a value of f(x^k) in the bottom 25 percent of all f(x^k) values, and the ratios r_ij of the number of successes in subinterval j of coordinate i to the total number of points in subinterval j are computed for all i and j.
3. The modified probability for subinterval j of coordinate i is given by p_ij = r_ij / Σ_j r_ij, the normalized success ratio.
Several improvements on this allocation scheme are being considered for future testing.
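The realization above can be sketched in one dimension (our own test function and parameter choices; `random.choices` picks a subinterval with probability p_ij, and sampling is uniform within the chosen subinterval):

```python
import random

def f(x):
    # Two-well function with its global minimizer near x = -1.04.
    return (x * x - 1.0) ** 2 + 0.3 * x

def strategy_s6_1d(lo=-2.0, hi=2.0, m=8, n_points=80, rounds=6, seed=3):
    rng = random.Random(seed)
    width = (hi - lo) / m
    probs = [1.0 / m] * m                       # step a: initially uniform
    for _ in range(rounds):
        xs = []
        for _ in range(n_points):               # step b: allocate by probs
            j = rng.choices(range(m), weights=probs)[0]
            xs.append(lo + (j + rng.random()) * width)
        cutoff = sorted(f(x) for x in xs)[n_points // 4]  # bottom 25 percent
        succ, tot = [0] * m, [0] * m
        for x in xs:
            j = min(int((x - lo) / width), m - 1)
            tot[j] += 1
            succ[j] += f(x) <= cutoff           # count a "success"
        ratios = [succ[j] / tot[j] if tot[j] else 0.0 for j in range(m)]
        probs = [r / sum(ratios) for r in ratios]   # step c: normalize
    return probs

probs = strategy_s6_1d()
```

Nothing is ever deleted: a subinterval's probability can fall to zero this round yet the region stays inside S. On this landscape the mass concentrates on the subintervals around x = −1, though, as noted below for S5 and S6, the scheme can just as easily lock onto a good local well that is not global.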
In early tests it became apparent that performance of the various strategies fluctuated considerably,
depending on the particular test problem under investigation. For example, relative to the other strategies, S2 performed spectacularly on some problems but miserably on others. On closer examination it was found that S2 did well on problems for which the global f value was significantly lower than the local minima and for which the global region of attraction was quite large; that is, on problems which were
rather easy to solve. This suggests the need for a benchmark strategy to be used for assessing problem
difficulty. The benchmark strategy should have as little structure as possible. We have chosen to use
the pure random search method for this purpose.
STRATEGY S0 (Pure Random Search — Brooks [2]):
a. Set k = 1.
b. Randomly select x^k ∈ S. Evaluate f(x^k).
c. Replace k with k + 1. Go to step b. At each stage retain the best f value found to date.
This strategy may be regarded as a benchmark method since it makes no attempt to take advantage of the information gathered at previous stages. In this sense it is probably the most primitive strategy possible. We can use S0 in two ways:
1. If a strategy does not do considerably better than S0, it should be discarded.
2. If a test problem is such that S0 can solve it nearly as well as the other strategies, then the problem is not very difficult and probably is not useful for discriminating among strategies.
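The benchmark is nearly a one-liner in the same illustrative setting (our own two-well test function, not one of the paper's problems):

```python
import random

def f(x):
    # Two-well function; global minimum value is approximately -0.305.
    return (x * x - 1.0) ** 2 + 0.3 * x

def strategy_s0(lo=-2.0, hi=2.0, budget=1000, seed=4):
    # Pure random search: sample, evaluate, retain only the best f value.
    rng = random.Random(seed)
    return min(f(rng.uniform(lo, hi)) for _ in range(budget))
```

With a 1,000-evaluation budget the best sampled value falls close to, but never below, the true global minimum.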
IV. COMPUTATIONAL EXPERIMENTS
A number of computational experiments were performed to compare the various strategies presented above. For each of the test functions employed, each strategy was run 30 times with different random number sequences. A run was allowed to continue until the algorithm had required 1,000 evaluations of the objective function f(x).
Test problems with predictable local and global solutions were constructed using the objective function
f(x) = −Σ_{j=1}^{m} c_j exp[(x − p_j)′A_j(x − p_j)].
This function consists of the superposition of m modes, where mode j has depth c_j ∈ E^1, position p_j ∈ E^n, and shape and width determined by the n × n negative definite matrix A_j. Particular test functions were obtained by choosing the parameters c_j and p_j from a random number table. A_j was chosen to ensure that the m modes were narrow enough that they did not completely merge into one another.
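The construction can be sketched as follows; the diagonal A_j, the parameter ranges, and the resulting global minimum of approximately −max_j c_j are our own illustrative assumptions, not the paper's actual random-number-table values:

```python
import math
import random

def make_test_function(n=2, m=4, seed=0):
    # Superposition of m modes: mode j has depth c_j, position p_j, and a
    # diagonal negative definite A_j controlling its shape and width.
    # Parameter ranges below are illustrative assumptions only.
    rng = random.Random(seed)
    c = [rng.uniform(8.0, 13.0) for _ in range(m)]
    p = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(m)]
    a = [[-rng.uniform(20.0, 60.0) for _ in range(n)] for _ in range(m)]

    def f(x):
        total = 0.0
        for j in range(m):
            quad = sum(a[j][i] * (x[i] - p[j][i]) ** 2 for i in range(n))
            total -= c[j] * math.exp(quad)   # quad <= 0: a pit of depth c_j
        return total
    return f, c, p

f, c, p = make_test_function()
```

Evaluating f at each mode center p_j gives a value at or below −c_j, while far from every mode f tends to zero; large |A_j| entries keep the pits narrow so that they do not merge.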
Strategies S1 through S4 require an unconstrained minimizer. Since the purpose of the study is to compare global strategies, a minimizer is desired which uses the same information as is available to the other strategies — function values, but not derivatives. Powell's derivative-free method was selected [4].
V. RESULTS
The computational results obtained are summarized in Tables 1 and 2. Table 1 gives characteristics of the test problems used. Table 2 lists, for each problem and for each strategy, the best f value obtained after 200, 500, and 1,000 function evaluations. Each value is the average of the 30 trials conducted for that problem and strategy. The percentage of the 30 trials which did not locate the global minimum after 1,000 function evaluations is also given in Table 2. It is difficult to obtain a single measure of performance for this kind of problem since we must balance speed of convergence against the chance that the global solution will be missed entirely.
Table 1. Characteristics of Test Problems

Problem   Number of variables   Number of minima   Value of global minimum
   A               2                    4                    -9.0
   B               2                   10                    -9.9
   C               2                   10                    -9.3
   D               2                   10                    -9.8
   E               2                   10                   -13.0
   F               5                    5                    -9.4
   G               5                    5                   -10.1
   H               5                   10                   -10.0
   I               5                   10                    -8.9
   J               5                   20                   -11.9
Table 2. Test Results

(For each problem A-J and each strategy S0-S6: the best f value obtained after 200, 500, and 1,000 function evaluations, each entry the average of 30 trials, together with the percentage of the 30 trials which failed to locate the global minimum after 1,000 evaluations.)
… some general conclusions:
1. … not very challenging, since S0 did nearly as well as most other strategies.
2. … but frequently stops short of the global solution — it …
3. In general, S1, S3, and S4 perform about the same, and better than the other strategies.
4. S5 and S6 exhibit slow initial convergence. Both frequently tend to concentrate the search effort
around a good local minimum, which is not global.
5. On difficult problems even the best of these methods will frequently fail to locate the global
minimum.
It is also interesting to examine the entire graph of the number of function evaluations versus the
best function value obtained for each strategy. These curves are shown for test function H in Figure 1.
The results for function H are representative of those obtained for the other functions and serve to
emphasize conclusions 2, 3, and 4, above.
Note on this graph that S5 and S6 display a consistent decrease at an initial rate which is similar to that of the better strategies S3 and S4. However, since they start much higher on the graph, S5 and S6 never catch up. This is inherent in the methods. Given any starting point x^k, S3 and S4 immediately search to a local minimum, and thus quickly get a fairly low objective function value. Starting from the same initial point, S5 and S6 merely note the objective value and proceed to check other points, doing a global survey instead of a local minimization. Thus, in the initial stages, S5 and S6 are essentially identical to S0, pure random search. It is only after considerable information has been accumulated that these methods can concentrate their attention on promising search areas.
FIGURE 1. Performance of strategies on test function H (average of 30 trials for each strategy).
A comparison of S0, S1, S2, S3, and S4 is also interesting. In general, it seems that among these strategies, which alternate cycles of random searching with unconstrained minimizations, the best results are obtained by the methods which do the least random searching. Thus, S0 is purely random search, and its performance is the worst. S2 requires several (perhaps many) random evaluations before each minimization to find a point better than the current best local solution, and its performance is second worst. Strategy S1 selects one random point x^k before each minimization, while S3 selects one random direction d^k. Their performance is similar and almost as good as that of S4, which makes no random selections between minimizations. This strongly suggests that an improved strategy will consist of frequent minimizations coupled with an improved way of selecting starting points which are promising and which also sample the entire region.
In conclusion, it is appropriate to note that these six methods do not come near to exhausting the possible techniques for avoiding local solutions. Methods which are hybrids of these, and entirely new methods, should be tested. In particular, we hope to develop an algorithm which allocates unconstrained minimizations to various regions similar to the way strategy S6 allocates the individual points x^k. Such a method would combine the rapid local optimizing power of the minimization method with a global analysis of the feasible region.
REFERENCES
[1] Bocharov, N. and A. A. Feldbaum, "An Automatic Optimizer for the Search of the Smallest of Several Minima," Automation and Remote Control, Vol. 23, No. 3 (1962).
[2] Brooks, S. H., "A Discussion of Random Methods for Seeking Maxima," Operations Research 6, 244 (1958).
[3] Fiacco, A. V. and G. P. McCormick, Nonlinear Programming: Sequential Unconstrained Minimization Techniques (John Wiley and Sons, Inc., New York, 1968).
[4] Powell, M. J. D., "An Efficient Method for Finding the Minimum of a Function of Several Variables Without Calculating Derivatives," Computer Journal 7, 155 (1964).
[5] Zakharov, V. V., "A Random Search Method," Engineering Cybernetics 2, 26 (1969).
A NOTE ON
MATHEMATICAL PROGRAMMING WITH
FRACTIONAL OBJECTIVE FUNCTIONS
B. Mond
La Trobe University
Bundoora, Melbourne, Australia
and
B. D. Craven
University of Melbourne
Melbourne, Australia
INTRODUCTION
Consider the fractional programming problem with linear constraints, Problem 1 (P1):
(1) Maximize f(x)/g(x)
Subject to
(2) Ax ≤ b
(3) x ≥ 0.
It is assumed that the problem is regular, i.e., that the constraint set is nonempty and bounded and that f and g do not simultaneously become zero.
There has been a great deal of interest in various special cases of P1. In particular, if f and g are linear, Charnes and Cooper [1] showed that optimal solutions can be determined from optimal solutions of two associated linear programming problems. Charnes and Cooper's result was extended to the ratio of two quadratic functions by Swarup [3]. He considered P1 with f and g quadratic, and showed that an optimal solution, if it exists, can be obtained from the solutions of two associated quadratic programming problems, each with linear constraints and one quadratic constraint. Sharma [2] considered P1 with f and g polynomials. He showed that an optimal solution, if it exists, can be obtained from the solutions of two associated programming problems where the objective function is a polynomial and the constraints are all linear except for one polynomial constraint.
Here we consider a much wider class of functionals f and g, and obtain a theorem that includes as special cases the corresponding results of [1], [2], and [3].
NOTATION AND DEFINITIONS
A ∈ R^{m×n}, x ∈ R^n, b ∈ R^m, t ∈ R; f and g are mappings from R^n into R. φ denotes a monotone strictly increasing function from R into R, with φ(t) > 0 for t > 0.
577
578 B. MOND AND B. D. CRAVEN
For a specified function φ, define the functions F and G, for real positive t and y ∈ R^n, by
(4) F(y, t) = f(y/t)φ(t) and G(y, t) = g(y/t)φ(t).
Define
(5) F(y, 0) = lim_{t→0} F(y, t) and G(y, 0) = lim_{t→0} G(y, t)
whenever these limits exist. Assume that G(0, 0) = 0 whenever it exists.
RESULTS
Let us introduce the transformation y = tx, where for the specified function φ and a nonzero constant λ ∈ R, we require
(6) G(y, t) = λ.
On multiplying numerator and denominator of (1) by φ(t), and using (4) and (6), we obtain the associated nonlinear programming problem, Problem 2 (P2):
Maximize F(y, t),
Subject to
(7) Ay − bt ≤ 0
(8) G(y, t) = λ
(9) y ≥ 0
(10) t ≥ 0.
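For a concrete (and entirely our own) linear instance, the transformation can be checked numerically with φ(t) = t: maximizing (2x + 1)/(x + 2) subject to x ≤ 3, x ≥ 0 gives F(y, t) = 2y + t and G(y, t) = y + 2t, so P2 becomes: maximize 2y + t subject to y − 3t ≤ 0, y + 2t = 1, y ≥ 0, t ≥ 0.

```python
def ratio(x):
    # P1 objective for the illustrative instance: (2x + 1)/(x + 2).
    return (2.0 * x + 1.0) / (x + 2.0)

# P1: maximize the ratio directly over a fine grid on [0, 3].
xs = [3.0 * i / 1000 for i in range(1001)]
x_star = max(xs, key=ratio)

# P2: with G(y, t) = y + 2t = 1 (lambda = 1), search over t alone.
ts = [i / 100000.0 for i in range(100001)]
feasible = [(1.0 - 2.0 * t, t) for t in ts
            if 1.0 - 2.0 * t >= 0.0 and (1.0 - 2.0 * t) - 3.0 * t <= 0.0]
y_star, t_star = max(feasible, key=lambda yt: 2.0 * yt[0] + yt[1])
# y*/t* recovers the optimal x* of P1, as the theorem below asserts.
```

Here the P1 optimum is x* = 3, and the P2 optimum (y*, t*) = (0.6, 0.2) satisfies y*/t* = 3.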
LEMMA: If the point (y, t) satisfies the constraints of P2, then t > 0.
REMARK: This only requires proof if G(y, 0) is defined by (5). This is automatically the case if g is affine and φ(t) = t.
By (8), the point (0, 0) is not feasible for P2.
PROOF: Assume that the point (y, 0) is feasible for P2; then y ≠ 0. Let x be feasible for P1. Since the constraints are linear, x + ky is feasible for P1 for any positive k, contradicting the boundedness of the constraint set of P1.
THEOREM 1: If (i) 0 < sgn λ = sgn g(x*) for an optimal solution x* of P1, and (ii) (y*, t*) is an optimal solution of P2, then y*/t* is an optimal solution of P1.
PROOF: Assume that the theorem is false, i.e., assume that there exists an optimal x* such that
(11) f(x*)/g(x*) > f(y*/t*)/g(y*/t*).
By condition (i),
g(x*) = θλ for some θ > 0.
Consider t = φ⁻¹(1/θ) and y = φ⁻¹(1/θ)x* = tx*. Then φ(t)g(x*) = G(y, t) = λ, so (y, t) satisfies (8), (9), and (10), and also (7) since the constraint is linear. Thus (y, t) is a feasible solution of P2.
Now
(12) f(x*)/g(x*) = φ(t)f(x*)/[φ(t)g(x*)] = F(y, t)/λ.
Also
(13) f(y*/t*)/g(y*/t*) = φ(t*)f(y*/t*)/[φ(t*)g(y*/t*)] = F(y*, t*)/G(y*, t*) = F(y*, t*)/λ.
Hence, for feasible (y, t), (11), (12), and (13) show that F(y, t) > F(y*, t*), contradicting the assumption that (y*, t*) is optimal for P2.
If sgn g(x*) < 0 for x* an optimal solution of P1, then replacing f by −f and g by −g leaves the functional unaltered, and for the new denominator we have −g(x*) > 0.
Thus, if P1 has a solution, it can be obtained by solving, for λ = 1 and λ = −1, the two nonlinear programming problems, Problem 3 (P3):
Maximize F(y, t)
Subject to
(14) Ay − bt ≤ 0
G(y, t) = 1
y ≥ 0, t ≥ 0,
and Problem 4 (P4):
Maximize −F(y, t)
Subject to
(15) Ay − bt ≤ 0
G(y, t) = −1
y ≥ 0, t ≥ 0.
SPECIAL CASES
If/ and g are linear and 4>(t) — t, then our theorem gives thi
If/and g are quadratic functions and 4>{t) = t 2 , we obtain the result
nomials of degree m and <£>(') = t m , we obtain the result of Sharrna [2].
If f and g are homogeneous of degree k, and φ(t) = t^k, then F(y, t) = f(y), G(y, t) = g(y), and P2 takes a simple form. An example is
f(x) = d′x + (x′Cx)^{1/2},
where C is a positive semidefinite matrix.
REMARKS
As noted in [2] and [3], even if G(y, t) is a convex function of y and t, the constraint set of P3 is not necessarily convex. Instead of P3, therefore, it is sometimes more convenient to deal with the following Problem 3' (P3'):
Maximize F(y, t)
Subject to Ay − bt ≤ 0
G(y, t) ≤ 1
y ≥ 0, t ≥ 0.
If G(y, t) is a convex function of the vector variable (y, t), then the constraint set of P3' is convex.
It should be noted that even if g(x) is convex with respect to x, G(y, t) need not be convex with respect to the vector variable (y, t). As an example, consider g(x) = x′Cx − k, where C is a positive semidefinite symmetric matrix and k is a positive scalar. Thus g(x) is convex with respect to x. Taking φ(t) = t^2, G(y, t) = y′Cy − kt^2, which is not convex.
If (y*, t*) is optimal for P3', t* > 0, and G(y*, t*) = 0, then max P1 may be ∞, since x* = y*/t* is feasible for P1 and g(x*) = 0. If G(y*, t*) = λ₁, where 0 < λ₁ < ∞, then (y*, t*) is also optimal for P2 with λ = λ₁, so Theorem 1 applies. However, the optimum of P3' can occur at (y, t) = (0, 0), which does not correspond to an optimum of P1. For example, if P1 is the program (for a real variable x)
P1: Maximize f(x) = (−x − 3)/(x + 1) subject to x ≤ 2 and x ≥ 0,
then, taking φ(t) = t, the corresponding P3' is
P3': Maximize −y − 3t subject to y + t ≤ 1, y ≥ 0, t ≥ 0, y − 2t ≤ 0.
The maximum for P3' then occurs at (y, t) = (0, 0), but the maximum for P1 occurs at x = 2.
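The counterexample is easy to verify by brute force (a grid-search sketch of our own, not part of the paper):

```python
def f1(x):
    # P1 objective: (-x - 3)/(x + 1), which is increasing on [0, 2].
    return (-x - 3.0) / (x + 1.0)

# P1: maximize over the feasible interval [0, 2].
x_best = max((2.0 * i / 1000 for i in range(1001)), key=f1)

# P3': maximize -y - 3t over y + t <= 1, y - 2t <= 0, y >= 0, t >= 0.
grid = [i / 200.0 for i in range(201)]
yt_best = max(((y, t) for y in grid for t in grid
               if y + t <= 1.0 and y - 2.0 * t <= 0.0),
              key=lambda yt: -yt[0] - 3.0 * yt[1])
# P3' peaks at (0, 0), which corresponds to no feasible x at all,
# while P1 attains its maximum at the boundary point x = 2.
```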
Similarly, instead of P4, it might be more convenient to consider Problem 4' (P4'):
Maximize −F(y, t)
Subject to
Ay − bt ≤ 0
G(y, t) ≥ −1
y ≥ 0, t ≥ 0.
If G is concave in the vector variable (y, t), the constraint set of P4' is convex.
REFERENCES
[1] Charnes, A. and W. W. Cooper, "Programming With Linear Fractional Functionals," Nav. Res. Log. Quart. 9, 181-186 (1962).
[2] Sharma, I. C., "Feasible Direction Approach to Fractional Programming Problems," Opsearch 4, 61-72 (1967).
[3] Swarup, K., "Programming with Quadratic Fractional Functionals," Opsearch 2, 23-30 (1965).
NEWS AND MEMORANDA
Mathematical Models of Target Coverage and Missile Allocation
The Military Operations Research Society announces that it now has copies of its first monograph,
"Mathematical Models of Target Coverage and Missile Allocation" by A. Ross Eckler and Stefan A.
Burr, available for sale. The book may be purchased for $7.50 postpaid by contacting: MORS, 101
South Whiting St., Alexandria, Va. 22304.
The monograph presents a comprehensive summary of analytical models primarily used for
problems in strategic defense but applicable to a wide variety of more generalized resource allocation
problems. The topics discussed include models of defended point targets, circular targets, Gaussian targets, generalized area targets, groups of identical targets, and nonidentical targets. Offense and defense strategies are examined under alternative assumptions of information available to both sides. An extensive bibliography is included.
U.S. GOVERNMENT PRINTING OFFICE: 1973— 541 387: 1
583
INFORMATION FOR CONTRIBUTORS
The NAVAL RESEARCH LOGISTICS QUARTERLY is devoted to the dissemination of
scientific information in logistics and will publish research and expository papers, including those
in certain areas of mathematics, statistics, and economics, relevant to the overall effort to improve
the efficiency and effectiveness of logistics operations.
Manuscripts and other items for publication should be sent to The Managing Editor, NAVAL
RESEARCH LOGISTICS QUARTERLY, Office of Naval Research, Arlington, Va. 22217.
Each manuscript which is considered to be suitable material for the QUARTERLY is sent to one
or more referees.
Manuscripts submitted for publication should be typewritten, double-spaced, and the author should retain a copy. Refereeing may be expedited if an extra copy of the manuscript is submitted with the original.
A short abstract (not over 400 words) should accompany each manuscript. This will appear
at the head of the published paper in the QUARTERLY.
There is no authorization for compensation to authors for papers which have been accepted
for publication. Authors will receive 250 reprints of their published papers.
Readers are invited to submit to the Managing Editor items of general interest in the field
of logistics, for possible publication in the NEWS AND MEMORANDA or NOTES sections
of the QUARTERLY.
NAVAL RESEARCH
LOGISTICS
QUARTERLY
SEPTEMBER 1973
VOL. 20, NO. 3
NAVSO P1278
CONTENTS
ARTICLES
Sequential Determination of Inspection Epochs for Reliability
Systems with General Lifetime Distributions
An Empirical Bayes Estimator for the Scale Parameter of the Two-Parameter Weibull Distribution
Optimal Allocation of Unreliable Components for Maximizing
Expected Profit Over Time
A Continuous Submarine Versus Submarine Game
Total Optimality of Incrementally Optimal Allocations
An Approach to the Allocation of Common Costs of Multi-Mission Systems
An Explicit General Solution in Linear Fractional Programming
Using Decomposition in Integer Programming
Numerical Treatment of a Class of Semi-Infinite Programming Problems
Min/Max Bounds for Dynamic Network Flows
Production-Allocation Scheduling and Capacity Expansion Using Network Flows Under Uncertainty
Concave Minimization over a Convex Polyhedron
Estimation of a Hidden Service Distribution of an M/G/∞ System
The Single Server Queue in Discrete Time — Numerical Analysis III
Some Experiments in Global Optimization
A Note on Mathematical Programming with Fractional Objective Functions
News and Memoranda
S. ZACKS, 377
W. J. FENSKE
G. K. BENNETT 31
C. G. HENIN
E. LANGFORD
L. D. STONE 419
R. T. CROW
A. CHARNES,
W. W. COOPER
L. SCHRAGE
S. A. GUSTAFSON,
K. O. KORTANEK
W. L. WILKINSON
J. PRAWDA
H. A. TAHA
L. L. GEORGE,
A. C. AGRAWAL
M. F. NEUTS,
E. KLIMKO
J. K. HARTMAN
B. MOND,
B. D. CRAVEN
OFFICE OF NAVAL RESEARCH
Arlington, Va. 22217